HALF-TREK CRITERION FOR GENERIC IDENTIFIABILITY OF 
LINEAR STRUCTURAL EQUATION MODELS 

RINA FOYGEL, JAN DRAISMA, AND MATHIAS DRTON 

Abstract. A linear structural equation model relates random variables of
interest and corresponding Gaussian noise terms via a linear equation system.
Each such model can be represented by a mixed graph in which directed edges
encode the linear equations, and bidirected edges indicate possible correlations
among noise terms. We study parameter identifiability in these models, that
is, we ask for conditions that ensure that the edge coefficients and correlations
appearing in a linear structural equation model can be uniquely recovered
from the covariance matrix of the associated normal distribution. We treat
the case of generic identifiability, where unique recovery is possible for almost
every choice of parameters. We give a new graphical criterion that is sufficient
for generic identifiability. It improves criteria from prior work and does not
require the directed part of the graph to be acyclic. We also develop a related
necessary condition and examine the "gap" between sufficient and necessary
conditions through simulations as well as exhaustive algebraic computations
for graphs with up to five nodes.

1. Introduction

When modeling the joint distribution of a random vector $X = (X_1, \dots, X_m)^T$,
it is often natural to appeal to noisy functional relationships. In other words, each
variable $X_w$ is assumed to be a function of the remaining variables and a stochastic
noise term $\epsilon_w$. The resulting models are known as linear structural equation models
when the relationship is linear, that is, when

(1.1)  $X_w = \lambda_{0w} + \sum_{v \neq w} \lambda_{vw} X_v + \epsilon_w, \quad w = 1, \dots, m,$

or, in vectorized form with a matrix $\Lambda = (\lambda_{vw})$ that is tacitly assumed to have
zeros along the diagonal,

(1.2)  $X = \lambda_0 + \Lambda^T X + \epsilon.$

The classical distributional assumption is that the error vector $\epsilon = (\epsilon_1, \dots, \epsilon_m)$
has a multivariate normal distribution with zero mean and some covariance matrix
$\Omega = (\omega_{vw})$. Writing $I$ for the identity matrix, it follows that $X$ has a multivariate
normal distribution with mean vector $(I - \Lambda)^{-T} \lambda_0$ and covariance matrix

(1.3)  $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}.$

Background on structural equation modeling can be found, for instance, in [Bol89].
As emphasized in [SGS00, Pea00], their great popularity in applied sciences is due
to the natural causal interpretation of the involved functional relationships.

Figure 1. Mixed graph for the instrumental variable model. 

Interesting models are obtained by imposing some pattern of zeros among the
coefficients $\lambda_{vw}$ and the covariances $\omega_{vw}$. It is convenient to think of the zero
patterns as being associated with a mixed graph that contains directed edges $v \to w$
to indicate possibly non-zero coefficients $\lambda_{vw}$ and bidirected edges $v \leftrightarrow w$ when $\omega_{vw}$
is a possibly non-zero covariance; in figures we draw the bidirected edges dashed
for better distinction. Mixed graph representations were first advocated in
[Wri21, Wri34] and are also known as path diagrams. We briefly illustrate this in
the next example, which gives the simplest version of what are often referred to as
instrumental variable models; see also [DMS10].

Example 1 (IV). Suppose that, as in [ER99], we record an infant's birth weight
($X_3$), the level of maternal smoking during pregnancy ($X_2$), and the cigarette tax
rate that applies ($X_1$). A model of interest, with mixed graph in Figure 1, assumes

    $X_1 = \lambda_{01} + \epsilon_1, \quad X_2 = \lambda_{02} + \lambda_{12} X_1 + \epsilon_2, \quad X_3 = \lambda_{03} + \lambda_{23} X_2 + \epsilon_3,$

with an error vector $\epsilon$ that has zero mean vector and covariance matrix

    $\Omega = \begin{pmatrix} \omega_{11} & 0 & 0 \\ 0 & \omega_{22} & \omega_{23} \\ 0 & \omega_{23} & \omega_{33} \end{pmatrix}.$

The possibly non-zero entry $\omega_{23}$ can absorb the effects that unobserved confounders
(such as age, income, genetics, etc.) may have on both $X_2$ and $X_3$; compare [RS02,
Wer11] for background on mixed graph representations of latent variable problems.

Formally, a mixed graph is a triple $G = (V, D, B)$ where $V$ is a finite set of nodes
and $D, B \subset V \times V$ are two sets of edges. In our context, the nodes correspond to
the random variables $X_1, \dots, X_m$, and we simply let $V = [m] := \{1, \dots, m\}$. The
pairs $(v, w)$ in the set $D$ represent directed edges and we will always write $v \to w$;
$v \to w \in D$ does not imply $w \to v \in D$. The pairs in $B$ are bidirected edges $v \leftrightarrow w$;
they have no orientation, that is, $v \leftrightarrow w \in B$ if and only if $w \leftrightarrow v \in B$. Neither
the bidirected part $(V, B)$ nor the directed part $(V, D)$ contains self-loops, that is,
$v \to v \notin D$ and $v \leftrightarrow v \notin B$ for all $v \in V$. If the directed part $(V, D)$ does not
contain directed cycles (that is, no cycle $v \to \dots \to v$ can be formed from the edges
in $D$), then the mixed graph $G$ is said to be acyclic.

Let $\mathbb{R}^D$ be the set of real $m \times m$ matrices $\Lambda = (\lambda_{vw})$ with support $D$, that is,
$\lambda_{vw} = 0$ if $v \to w \notin D$. Write $\mathbb{R}^D_{\mathrm{reg}}$ for the subset of matrices $\Lambda \in \mathbb{R}^D$ for which $I - \Lambda$
is invertible, where $I$ denotes the identity matrix. (If $G$ is acyclic, then $\mathbb{R}^D = \mathbb{R}^D_{\mathrm{reg}}$;
see the remark after equation (2.3).) Similarly, let $PD_m$ be the cone of positive
definite symmetric $m \times m$ matrices $\Omega = (\omega_{vw})$ and define $PD(B) \subset PD_m$ to be the
subcone of matrices with support $B$, that is, $\omega_{vw} = 0$ if $v \neq w$ and $v \leftrightarrow w \notin B$.

Definition 1. The linear structural equation model given by a mixed graph
$G = (V, D, B)$ on $V = [m]$ is the family of all $m$-variate normal distributions with
covariance matrix

    $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$

for $\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega \in PD(B)$.

The first question that arises when specifying a linear structural equation model
is whether the model is identifiable in the sense that the parameter matrices
$\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega \in PD(B)$ can be uniquely recovered from the normal distribution they
define. Clearly, this is equivalent to asking whether they can be recovered from the
distribution's covariance matrix, and thus we ask whether the fiber

(1.4)  $\mathcal{F}(\Lambda, \Omega) = \{ (\Lambda', \Omega') \in \Theta : \phi_G(\Lambda', \Omega') = \phi_G(\Lambda, \Omega) \}$

is equal to $\{(\Lambda, \Omega)\}$. Here, we introduced the shorthand $\Theta := \mathbb{R}^D_{\mathrm{reg}} \times PD(B)$. Put
differently, identifiability holds if the parametrization map

(1.5)  $\phi_G : (\Lambda, \Omega) \mapsto (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$

is injective on $\Theta$, or a suitably large subset.
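As a small numerical illustration of the map in (1.5), the following sketch evaluates $\phi_G$ for the instrumental variable graph of Example 1 in exact rational arithmetic. The helper names (`matmul`, `transpose`, `inv_identity_minus`, `phi`) and the parameter values are illustrative assumptions and not part of the paper. The inverse is computed through the finite sum $I + \Lambda + \dots + \Lambda^{m-1}$, which is only valid when the directed part of the graph is acyclic.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv_identity_minus(Lam):
    """(I - Lam)^{-1} computed as I + Lam + ... + Lam^(m-1).

    Assumption: Lam is nilpotent, i.e. the directed part is acyclic.
    """
    m = len(Lam)
    identity = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    total, power = identity, identity
    for _ in range(m - 1):
        power = matmul(power, Lam)
        total = [[total[i][j] + power[i][j] for j in range(m)] for i in range(m)]
    return total

def phi(Lam, Om):
    """Covariance matrix (I - Lam)^{-T} Om (I - Lam)^{-1}."""
    M = inv_identity_minus(Lam)
    return matmul(matmul(transpose(M), Om), M)

# IV graph of Example 1: 1 -> 2 -> 3 and 2 <-> 3.
# Row v, column w of Lam holds lambda_vw; the values are arbitrary choices.
l12, l23, w23 = Fraction(1, 2), Fraction(2), Fraction(3, 10)
Lam = [[0, l12, 0],
       [0, 0, l23],
       [0, 0, 0]]
Om = [[1, 0, 0],
      [0, 1, w23],
      [0, w23, 1]]
Sigma = phi(Lam, Om)
for row in Sigma:
    print([str(x) for x in row])
```

Exact fractions are used so that equalities between covariance matrices can be tested without rounding error; for cyclic graphs the series would have to be replaced by a genuine matrix inverse.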

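Example 2 (IV, continued). In the instrumental variable model associated with
the graph in Figure 1,

    $\Sigma = (\sigma_{vw}) = \begin{pmatrix}
        \omega_{11} & \omega_{11}\lambda_{12} & \omega_{11}\lambda_{12}\lambda_{23} \\
        \omega_{11}\lambda_{12} & \omega_{22} + \omega_{11}\lambda_{12}^2 & \omega_{23} + \lambda_{23}\sigma_{22} \\
        \omega_{11}\lambda_{12}\lambda_{23} & \omega_{23} + \lambda_{23}\sigma_{22} & \omega_{33} + 2\omega_{23}\lambda_{23} + \lambda_{23}^2\sigma_{22}
    \end{pmatrix}.$

Despite the presence of both the edges $2 \to 3$ and $2 \leftrightarrow 3$, we can recover $\Lambda$ (and
thus also $\Omega$) from $\Sigma$ using that

    $\lambda_{12} = \frac{\sigma_{12}}{\sigma_{11}}, \qquad \lambda_{23} = \frac{\sigma_{13}}{\sigma_{12}}.$

The first denominator $\sigma_{11}$ is always positive since $\Sigma$ is positive definite. The second
denominator $\sigma_{12}$ is zero if and only if $\lambda_{12} = 0$. In other words, if the cigarette tax
($X_1$) has no effect on maternal smoking during pregnancy ($X_2$), then there is no
way to distinguish between the causal effect of smoking on birth weight (coefficient
$\lambda_{23}$) and the effects of confounding variables (coefficient $\omega_{23}$). Indeed the map $\phi_G$
is injective only on the subset of $\Theta$ with $\lambda_{12} \neq 0$.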
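The recovery formulas and the failure of injectivity at $\lambda_{12} = 0$ can be checked numerically. The sketch below is a minimal illustration: the function names `iv_sigma` and `recover_lambda` and all parameter values are assumptions made for this example, and the entries of $\Sigma$ are copied from the matrix displayed in Example 2.

```python
from fractions import Fraction as F

def iv_sigma(l12, l23, w11, w22, w33, w23):
    """Upper-triangular entries of Sigma for the IV graph (Example 2)."""
    s22 = w22 + w11 * l12 ** 2
    return {
        (1, 1): w11,
        (1, 2): w11 * l12,
        (1, 3): w11 * l12 * l23,
        (2, 2): s22,
        (2, 3): w23 + l23 * s22,
        (3, 3): w33 + 2 * w23 * l23 + l23 ** 2 * s22,
    }

def recover_lambda(sigma):
    """Return (lambda_12, lambda_23), or None when sigma_12 = 0."""
    if sigma[(1, 2)] == 0:
        return None
    return sigma[(1, 2)] / sigma[(1, 1)], sigma[(1, 3)] / sigma[(1, 2)]

# Generic case: lambda_12 != 0, so both coefficients are recovered.
generic = iv_sigma(F(1, 2), F(2), F(1), F(1), F(1), F(3, 10))
print(recover_lambda(generic))

# Degenerate case lambda_12 = 0: two different parameter choices
# (both with positive definite Omega) produce the same Sigma.
first = iv_sigma(F(0), F(2), F(1), F(1), F(1), F(3, 10))
second = iv_sigma(F(0), F(1), F(1), F(1), F(13, 5), F(13, 10))
print(first == second, recover_lambda(first))
```

In the degenerate case the second parameter choice trades a smaller causal effect $\lambda_{23}$ for a larger error covariance $\omega_{23}$, which is exactly the confounding ambiguity described in the example.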

In this paper we study the kind of identifiability encountered in the instrumental
variables example. The statistical literature often refers to this as almost-everywhere
identifiability to express that the exceptional pairs $(\Lambda, \Omega)$ with fiber
cardinality $|\mathcal{F}(\Lambda, \Omega)| > 1$ form a set of measure zero. However, since the map $\phi_G$ is
rational, the exceptional sets are well-behaved null sets, namely, they are algebraic
subsets. An algebraic subset $V \subset \Theta$ is a subset that can be defined by polynomial
equations, and it is a proper subset of the open set $\Theta$ unless it is defined by the zero
polynomial. A proper algebraic subset has smaller dimension than $\Theta$ (see [CLO07]),
and thus also measure zero; statistical work often quotes the lemma in [Oka73] for
the latter fact. These observations motivate the following definition and problem.

Definition 2. The mixed graph $G$ is said to be generically identifiable if $\phi_G$ is
injective on the complement $\Theta \setminus V$ of a proper (i.e., strict) algebraic subset $V \subset \Theta$.

Problem 1. Characterize the mixed graphs $G$ that are generically identifiable.

Despite the long history of linear structural equation models, the problem just
stated remains open, even when restricting to acyclic mixed graphs. However, in
the last two decades a number of graphical conditions have been developed that
are sufficient for generic identifiability. We refer the reader in particular to [Pea00],
[BP02b], [BP06], [Tia09], and [CK10], which each contain many further references.
To our knowledge, the condition that is of most general nature and most in the
spirit of attempting to solve Problem 1 is the G-criterion of [BP06]. This criterion,
and in fact all other mentioned work, uses linear algebraic techniques to solve
the parametrized equation systems that define the fibers $\mathcal{F}(\Lambda, \Omega)$. Therefore, the
G-criterion is in fact sufficient for the following stronger notion of identifiability,
which we have seen to hold for the graph from Figure 1; recall the formulas given
in Example 2.

Definition 3. The mixed graph $G$ is said to be rationally identifiable if there exist a
proper algebraic subset $V \subset \Theta$ and a rational map $\psi$ such that $\psi \circ \phi_G(\Lambda, \Omega) = (\Lambda, \Omega)$
for all $(\Lambda, \Omega) \in \Theta \setminus V$.

The main results of our paper give a graphical condition that is sufficient for
rational identifiability and that is strictly stronger than the G-criterion of [BP06]
when applied to acyclic mixed graphs. However, the new condition, which we name
the half-trek criterion, is also applicable to cyclic graphs, for which little prior work
exists. The approach we take also yields a necessary condition, or more precisely
put, a graphical condition that is sufficient for $G$ (or rather the map $\phi_G$) to be
generically infinite-to-one. That is, the condition implies that the fiber $\mathcal{F}(\Lambda, \Omega)$ is
infinite for all pairs $(\Lambda, \Omega)$ outside a proper algebraic subset of $\Theta$. If $|\mathcal{F}(\Lambda, \Omega)| = h$
outside a proper algebraic subset, then we say that $G$ is generically $h$-to-one.

Our main results just described are stated in detail in Section 3 and proven in
Sections 8 and 9. The comparison to the G-criterion is made in Section 4, with some
proofs deferred to Section 10. Some interesting examples are visited in Section 5.
Those include examples that do not seem to be covered by any known graphical
criterion. These examples were found as part of an exhaustive study of the identifiability
properties of all mixed graphs with up to 5 nodes. The study is based on
techniques from computational algebraic geometry [CLO07]. The results together
with simulations for graphs with 6 and 7 nodes are given in Section 6. In Section 7
we describe how our new half-trek criterion behaves with respect to a graph decomposition
technique for acyclic mixed graphs that is due to [Tia05]. Concluding remarks are
given in Section 12.

2. Preliminaries on treks 

A path from node $v$ to node $w$ in a mixed graph $G = (V, D, B)$ is a sequence of
edges, each from either $D$ or $B$, that connect the consecutive nodes in a sequence
of nodes beginning at $v$ and ending in $w$. We do not require paths to be simple or
even to obey directions, that is, a path may include a particular edge more than
once, the nodes that are part of the edges need not all be distinct, and directed
edges may be traversed in the wrong direction. A path $\pi$ from $v$ to $w$ is a directed
path if all its edges are directed and pointing to $w$, that is, it is of the form

    $v = v_0 \to v_1 \to \dots \to v_r = w.$

In a covariance matrix in a structural equation model, that is, a matrix structured
as in Definition 1, the entry $\sigma_{vw}$ is a sum of terms that correspond to certain paths
from $v$ to $w$. For instance, in Example 2, the variance

(2.1)  $\sigma_{33} = \omega_{33} + \omega_{23}\lambda_{23} + \lambda_{23}\omega_{23} + \lambda_{23}^2\omega_{22} + \lambda_{23}^2\lambda_{12}^2\omega_{11}$

is a sum of five terms that are associated with the trivial path 3, which has no
edges, and the four additional paths

    $3 \leftrightarrow 2 \to 3, \quad 3 \leftarrow 2 \leftrightarrow 3, \quad 3 \leftarrow 2 \to 3, \quad 3 \leftarrow 2 \leftarrow 1 \to 2 \to 3.$

In the literature, the paths that contribute to a covariance are known as treks; compare,
e.g., [STD10] and the references therein. A trek from 'source' $v$ to 'target' $w$ is
a path from $v$ to $w$ whose consecutive edges do not have any colliding arrowheads.
In other words, a trek from $v$ to $w$ is a path of one of the two following forms:

    $v_l^L \leftarrow v_{l-1}^L \leftarrow \dots \leftarrow v_1^L \leftarrow v_0^L \leftrightarrow v_0^R \to v_1^R \to \dots \to v_{r-1}^R \to v_r^R$

or

    $v_l^L \leftarrow v_{l-1}^L \leftarrow \dots \leftarrow v_1^L \leftarrow v^T \to v_1^R \to \dots \to v_{r-1}^R \to v_r^R,$

where the endpoints are $v_l^L = v$, $v_r^R = w$. In the first case, we say that the left-hand
side of $\pi$, written $\mathrm{Left}(\pi)$, is the set of nodes $\{v_0^L, v_1^L, \dots, v_l^L\}$, and the right-hand
side, written $\mathrm{Right}(\pi)$, is the set of nodes $\{v_0^R, v_1^R, \dots, v_r^R\}$. In the second case,
$\mathrm{Left}(\pi) = \{v^T, v_1^L, \dots, v_l^L\}$ and $\mathrm{Right}(\pi) = \{v^T, v_1^R, \dots, v_r^R\}$; note that the 'top'
node $v^T$ is part of both sides of the trek. As pointed out before, paths and in
particular treks are not required to be simple. A trek $\pi$ may thus pass through a
node on both its left- and right-hand sides. If the graph contains a cycle, then the
left- or right-hand side of $\pi$ may contain this cycle. A trek from $v$ to $v$ may have
no edges, in which case $v$ is the top node, and $\mathrm{Left}(\pi) = \mathrm{Right}(\pi) = \{v\}$, and we
call the trek trivial.

A trek is obtained by concatenating two directed paths at a common top node
or by joining them with a bidirected edge, and the connection between the matrix
entries and treks is due to the fact that

(2.2)  $\big( (I - \Lambda)^{-1} \big)_{vw} = \sum_{\pi \in \mathcal{P}(v, w)} \prod_{x \to y \in \pi} \lambda_{xy},$

where $\mathcal{P}(v, w)$ is the set of directed paths from $v$ to $w$ in $G$. The equality in (2.2)
follows by writing $(I - \Lambda)^{-1} = I + \Lambda + \Lambda^2 + \dots$. For a precise statement about
the form of the covariance matrix $\Sigma$, let $\mathcal{T}(v, w)$ be the set of all treks from $v$ to
$w$. For a trek $\pi$ that contains no bidirected edge and has top node $v$, define a trek
monomial as

    $\pi(\lambda, \omega) = \omega_{vv} \prod_{x \to y \in \pi} \lambda_{xy}.$

For a trek $\pi$ that contains a bidirected edge $v \leftrightarrow w$, define the trek monomial as

    $\pi(\lambda, \omega) = \omega_{vw} \prod_{x \to y \in \pi} \lambda_{xy}.$

Then the following rule [SGS00, Wri21, Wri34] expresses the covariance matrix $\Sigma$ as a
summation over treks; compare the example in (2.1).

Trek Rule. The covariance matrix $\Sigma$ for a mixed graph $G$ is given by

(2.3)  $\sigma_{vw} = \sum_{\pi \in \mathcal{T}(v, w)} \pi(\lambda, \omega).$

Figure 2. An acyclic mixed graph.

We remark that if $G$ is acyclic then $\Lambda^k = 0$ for all $k \geq m$, and so the expression
in (2.2) is polynomial. Similarly, (2.3) writes $\sigma_{vw}$ as a polynomial. If $G$ is cyclic,
then one obtains power series that converge if the entries of $\Lambda$ are small enough.
However, in the proofs of Section 8 it will also be useful to treat these as formal
power series.
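For an acyclic graph the Trek Rule can be checked by brute-force enumeration, because every trek consists of a directed path down to the source, a directed path down to the target, and either a common top node or a bidirected edge joining the two. The following sketch does this for the instrumental variable graph of Example 1. The names `directed_paths`, `weight` and `trek_sum` as well as the parameter values are assumptions made for illustration; the recursion terminates only for acyclic graphs.

```python
from fractions import Fraction

# IV graph of Example 1 with arbitrary illustrative parameter values.
lam = {(1, 2): Fraction(1, 2), (2, 3): Fraction(2)}        # directed edges v -> w
omega = {(1, 1): Fraction(1), (2, 2): Fraction(1), (3, 3): Fraction(1),
         (2, 3): Fraction(3, 10), (3, 2): Fraction(3, 10)}  # variances and 2 <-> 3

def directed_paths(a, v):
    """Yield every directed path from a to v as a list of edges."""
    if a == v:
        yield []                        # the trivial path without edges
    for (x, y) in lam:
        if x == a:
            for rest in directed_paths(y, v):
                yield [(x, y)] + rest

def weight(path):
    """Product of the coefficients lambda_xy along a directed path."""
    result = Fraction(1)
    for edge in path:
        result *= lam[edge]
    return result

def trek_sum(v, w):
    """Sum of trek monomials over all treks from v to w, and their number."""
    total, count = Fraction(0), 0
    for (a, b), om in omega.items():
        # a == b: trek with top node a; a != b: trek through the edge a <-> b
        for left in directed_paths(a, v):
            for right in directed_paths(b, w):
                total += om * weight(left) * weight(right)
                count += 1
    return total, count

print(trek_sum(3, 3))
```

For the pair $(3, 3)$ the enumeration finds the five treks listed after (2.1), and the resulting sum agrees with the closed form $\omega_{33} + 2\omega_{23}\lambda_{23} + \lambda_{23}^2\sigma_{22}$ from Example 2 for the chosen values.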

Our identifiability results involve conditions that refer to paths that we term
half-treks. A half-trek $\pi$ is a trek with $|\mathrm{Left}(\pi)| = 1$, meaning that it is of the form

    $v_0^L \leftrightarrow v_0^R \to v_1^R \to \dots \to v_{r-1}^R \to v_r^R$
or
    $v^T \to v_1^R \to \dots \to v_{r-1}^R \to v_r^R.$

Example 3. In the graph shown in Figure 2,

(a) neither $\pi_1 : 2 \to 3 \to 4 \leftarrow 3$ nor $\pi_2 : 3 \to 4 \leftrightarrow 1$ is a trek, due to the
    colliding arrowheads at node 4.

(b) $\pi : 2 \leftarrow 1 \leftrightarrow 4 \to 5$ is a trek, but not a half-trek. $\mathrm{Left}(\pi) = \{1, 2\}$ and
    $\mathrm{Right}(\pi) = \{4, 5\}$.

(c) $\pi : 1 \to 2 \to 3$ is a half-trek with $\mathrm{Left}(\pi) = \{1\}$ and $\mathrm{Right}(\pi) = \{1, 2, 3\}$.

It will also be important to consider sets of treks. For a set of $n$ treks,
$\Pi = \{\pi_1, \dots, \pi_n\}$, let $x_i$ and $y_i$ be the source and the target of $\pi_i$, respectively. If the
sources are all distinct, and the targets are all distinct, then we say that $\Pi$ is a
system of treks from $X = \{x_1, \dots, x_n\}$ to $Y = \{y_1, \dots, y_n\}$, which we write as
$\Pi : X \rightrightarrows Y$. Note that there may be overlap between the sources in $X$ and the
targets in $Y$, that is, we might have $X \cap Y \neq \emptyset$. The system $\Pi$ is a system of
half-treks if every trek $\pi_i$ is a half-trek. Finally, a set of treks $\Pi = \{\pi_1, \dots, \pi_n\}$ has
no sided intersection if

    $\mathrm{Left}(\pi_i) \cap \mathrm{Left}(\pi_j) = \emptyset = \mathrm{Right}(\pi_i) \cap \mathrm{Right}(\pi_j) \quad \forall\, i \neq j.$
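The no-sided-intersection condition only depends on the Left and Right sets of the treks, so it is straightforward to test once these sets are known. The sketch below is a minimal illustration in which each trek is represented as a pair of node sets; this representation and the function name `has_sided_intersection` are assumptions made here. The two inputs are the systems of treks that Example 4 discusses for the graph in Figure 2.

```python
from itertools import combinations

def has_sided_intersection(treks):
    """treks: list of (left, right) pairs, each a set of nodes.

    Returns True if two distinct treks share a node on their left-hand
    sides or share a node on their right-hand sides.
    """
    return any(l1 & l2 or r1 & r2
               for (l1, r1), (l2, r2) in combinations(treks, 2))

# Example 4(a): pi_1 : 3 -> 4 -> 5 (top node 3) and pi_2 : 4 <-> 1.
system_a = [({3}, {3, 4, 5}), ({4}, {1})]

# Example 4(b): pi_1 : 1 <-> 4 and pi_2 : 3 -> 4 -> 5.
system_b = [({1}, {4}), ({3}, {3, 4, 5})]

print(has_sided_intersection(system_a))
print(has_sided_intersection(system_b))
```

Node 4 occurs in both treks of either system; only in the second system does it lie on the same (right-hand) side of both treks.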

Example 4. Consider again the graph from Figure 2.

(a) The pair of treks

        $\pi_1 : 3 \to 4 \to 5, \qquad \pi_2 : 4 \leftrightarrow 1$

    forms a system of treks $\Pi = \{\pi_1, \pi_2\}$ between $X = \{3, 4\}$ and $Y = \{1, 5\}$.
    The node 4 appears in both treks, but is in only the right-hand side of $\pi_1$
    and only the left-hand side of $\pi_2$. Therefore, $\Pi$ has no sided intersection.

(b) The set $\Pi = \{\pi_1, \pi_2\}$ comprising the two treks

        $\pi_1 : 1 \leftrightarrow 4, \qquad \pi_2 : 3 \to 4 \to 5$

    is a system of treks between $X = \{1, 3\}$ and $Y = \{4, 5\}$. Since node 4 is in
    $\mathrm{Right}(\pi_1) \cap \mathrm{Right}(\pi_2)$, the system $\Pi$ has a sided intersection.

3. Main identifiability and non-identifiability results

Define the set of parents of a node $v \in V$ as $P(v) = \{w : w \to v \in D\}$ and
the set of siblings as $S(v) = \{w : w \leftrightarrow v \in B\}$. Let $H(v)$ be the set of nodes in
$V \setminus (\{v\} \cup S(v))$ that can be reached from $v$ via a half-trek. These half-treks contain
at least one directed edge. Put differently, a node $w \neq v$ that is not a sibling of $v$ is
in $H(v)$ if $w$ is a proper descendant of $v$ or one of its siblings. The term 'descendant'
is commonly used to refer to a node that can be reached by a directed path.
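The description of $H(v)$ in terms of descendants suggests a simple graph search: start from $v$ and its siblings, follow directed edges, and discard the starting nodes at the end. The sketch below is a minimal illustration under the assumption that $D$ is stored as a set of ordered pairs and $B$ as a set of pairs with one (arbitrary) orientation per bidirected edge; the function names are hypothetical. It is applied to the instrumental variable graph of Example 1.

```python
def parents(v, D):
    """P(v): nodes w with a directed edge w -> v."""
    return {x for (x, y) in D if y == v}

def siblings(v, B):
    """S(v): nodes joined to v by a bidirected edge."""
    return {y for (x, y) in B if x == v} | {x for (x, y) in B if y == v}

def half_trek_reachable(v, D, B):
    """H(v): nodes outside {v} and S(v) reachable from v via a half-trek."""
    start = {v} | siblings(v, B)
    reached, stack = set(start), list(start)
    while stack:                       # depth-first search along directed edges
        x = stack.pop()
        for (a, b) in D:
            if a == x and b not in reached:
                reached.add(b)
                stack.append(b)
    return reached - start

# IV graph of Example 1: 1 -> 2 -> 3 and 2 <-> 3.
D = {(1, 2), (2, 3)}
B = {(2, 3)}
for v in (1, 2, 3):
    print(v, parents(v, D), siblings(v, B), half_trek_reachable(v, D, B))
```

Because visited nodes are recorded, the search also terminates when the directed part contains cycles.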

Definition 4. A set of nodes $Y \subset V$ satisfies the half-trek criterion with respect to
node $v \in V$ if

(i) $|Y| = |P(v)|$,
(ii) $Y \cap (\{v\} \cup S(v)) = \emptyset$, and
(iii) there is a system of half-treks with no sided intersection from $Y$ to $P(v)$.

We remark that if $P(v) = \emptyset$, then $Y = \emptyset$ satisfies the half-trek criterion with
respect to $v$. We are now ready to state the main results of this paper.

Theorem 1 (HTC-identifiability). Let $(Y_v : v \in V)$ be a family of subsets of the
vertex set $V$ of a mixed graph $G$. If, for each node $v$, the set $Y_v$ satisfies the half-trek
criterion with respect to $v$, and there is a total ordering $\prec$ on the vertex set $V$ such
that $w \prec v$ whenever $w \in Y_v \cap H(v)$, then $G$ is rationally identifiable.

Note that the existence of such a total ordering is equivalent to the condition
that the relation $w \in Y_v \cap H(v)$ does not admit cycles; given the family $(Y_v : v \in V)$
this can be tested in polynomial time in the size of the graph. However, we do not
know whether the existence of a family $(Y_v : v \in V)$, with $Y_v$ satisfying the half-trek
criterion with respect to $v$ for each node $v$, can be checked in polynomial time.

Theorem 2 (HTC-non-identifiability). Suppose $G$ is a mixed graph in which every
family $(Y_v : v \in V)$ of subsets of the vertex set $V$ either contains a set $Y_v$ that
fails to satisfy the half-trek criterion with respect to $v$ or contains a pair of sets
$(Y_v, Y_w)$ with $v \in Y_w$ and $w \in Y_v$. Then the parametrization $\phi_G$ is generically
infinite-to-one.

The main ideas underlying the two results are as follows. Under the conditions
given in Theorem 1, it is possible to recover the entries in the matrix $\Lambda$, column-by-column,
following the given ordering of the nodes. Each column is found by
solving a linear equation system that can be proven to have a unique solution. The
details of these computations are given in Section 8, where we prove Theorem 1.
The proof of Theorem 2 is also in Section 8 and rests on the fact that under the
given conditions the Jacobian of $\phi_G$ cannot have full rank.

In light of the two theorems we refer to a mixed graph $G$ as

(i) HTC-identifiable, if it satisfies the conditions of Theorem 1,
(ii) HTC-infinite-to-one, if it satisfies the conditions of Theorem 2,
(iii) HTC-classifiable, if it is either HTC-identifiable or HTC-infinite-to-one,
(iv) HTC-inconclusive, if it is not HTC-classifiable.

We now give a first example of an HTC-identifiable graph. Additional examples will
be given in Section 5, where we will see graphs that are generically $h$-to-one with
$2 \leq h < \infty$, but also that HTC-inconclusive graphs may be rationally identifiable
or generically infinite-to-one.




Example 5. The graph in Figure 2 is HTC-identifiable, which can be shown as
follows. Let

    $Y_1 = \emptyset, \quad Y_2 = \{5\}, \quad Y_3 = \{2\}, \quad Y_4 = \{2\}, \quad Y_5 = \{3\}.$

Then each $Y_v$ satisfies the half-trek criterion with respect to $v$ because

(a) trivially, $P(v) = \emptyset$ for $v = 1$;
(b) for $v = 2$, we have $5 \leftrightarrow 1 \to 2$;
(c) for $v = 3$, we have $2 \to 3$;
(d) for $v = 4$, we have $2 \to 3 \to 4$; and
(e) for $v = 5$, we have $3 \to 4 \to 5$.

Considering the descendant sets $H(v)$, we find that

    $Y_1 \cap H(1) = \emptyset, \quad Y_2 \cap H(2) = \{5\}, \quad Y_3 \cap H(3) = \emptyset,$
    $Y_4 \cap H(4) = \{2\}, \quad Y_5 \cap H(5) = \{3\}.$

Hence, any ordering $\prec$ respecting $3 \prec 5 \prec 2 \prec 4$ will satisfy the conditions of
Theorem 1.
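Finding an ordering as required in Theorem 1 amounts to a topological sort of the relation $w \in Y_v \cap H(v)$, which fails exactly when the relation has a cycle. The following sketch uses the sets $Y_v \cap H(v)$ stated in Example 5 and Python's standard `graphlib` module (available from Python 3.9); the function name `htc_order` and the dictionary encoding are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError

def htc_order(must_precede):
    """must_precede maps each node v to the set Y_v ∩ H(v).

    Returns a list of nodes in which every w in Y_v ∩ H(v) comes before v,
    or None if the relation contains a cycle.
    """
    try:
        return list(TopologicalSorter(must_precede).static_order())
    except CycleError:
        return None

# The sets Y_v ∩ H(v) from Example 5.
constraints = {1: set(), 2: {5}, 3: set(), 4: {2}, 5: {3}}
order = htc_order(constraints)
print(order)

# A cyclic relation admits no such ordering.
print(htc_order({1: {2}, 2: {1}}))
```

Any ordering returned for the constraints of Example 5 places node 3 before 5, node 5 before 2, and node 2 before 4, while node 1 may appear anywhere.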

A mixed graph $G = (V, D, B)$ is simple if there is at most one edge between any
pair of nodes, that is, if $D \cap B = \emptyset$ and $v \to w \in D$ implies $w \to v \notin D$. As observed
in [BP02b], simple acyclic mixed graphs are rationally identifiable; compare also
Corollary 3 in [DFS11]. It is not difficult to see that Theorem 1 includes this
observation as a special case.

Proposition 1. If $G$ is a simple acyclic mixed graph, then $G$ is HTC-identifiable.

Proof. Since $G$ is simple, it holds for every node $v \in V$ that $P(v) \cap S(v) = \emptyset$ and,
thus, $P(v)$ satisfies the half-trek criterion with respect to $v$ (take the trivial
half-trek at each parent). An acyclic graph has
at least one topological ordering $\prec$, that is, an ordering such that $v \to w \in D$ only
if $v \prec w$. In other words, $w \in P(v)$ implies $w \prec v$. Hence, the family $(P(v) : v \in V)$
together with a topological ordering $\prec$ satisfies the conditions of Theorem 1. $\square$

Another straightforward observation is that the map $\phi_G$ cannot be generically
finite-to-one if the dimension of the domain of definition $\mathbb{R}^D_{\mathrm{reg}} \times PD(B)$ is larger
than that of the space of $m \times m$ symmetric matrices that contains the image of $\phi_G$. This
occurs if $|D| + |B|$ is larger than $\binom{m}{2}$. Theorem 2 covers this observation.
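The dimension count behind this observation is elementary: the domain has $|D| + |B| + m$ free parameters (coefficients, error covariances, and the $m$ error variances), while symmetric $m \times m$ matrices form a space of dimension $\binom{m}{2} + m$. The sketch below spells this out; the function names are hypothetical, and $B$ is assumed to list each bidirected edge once.

```python
from math import comb

def domain_dimension(m, D, B):
    """Number of free parameters: |D| coefficients, |B| covariances, m variances."""
    return len(D) + len(B) + m

def symmetric_dimension(m):
    """Dimension of the space of symmetric m x m matrices."""
    return comb(m, 2) + m

def trivially_infinite_to_one(m, D, B):
    """True if the edge count alone rules out generically finite fibers."""
    return len(D) + len(B) > comb(m, 2)

# IV graph of Example 1: three edges on three nodes, so the count is not exceeded.
iv_D = {(1, 2), (2, 3)}
iv_B = {(2, 3)}
print(trivially_infinite_to_one(3, iv_D, iv_B))

# Adding the edge 1 -> 3 gives four edges on three nodes.
print(trivially_infinite_to_one(3, iv_D | {(1, 3)}, iv_B))
```

The edge-count test and the comparison of the two dimensions are equivalent, since the term $m$ appears on both sides.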

Proposition 2. If a mixed graph $G = (V, D, B)$ with $V = [m]$ has $|D| + |B| > \binom{m}{2}$
edges, then $G$ is HTC-infinite-to-one.

Proof. Suppose $G$ is not HTC-infinite-to-one. Then there exist subsets
$(Y_v : v \in V)$, where each $Y_v$ satisfies the half-trek criterion with respect to $v$ and for any pair
of sets $(Y_v, Y_w)$ it holds that $v \in Y_w$ implies $w \notin Y_v$.

Fix a node $v \in V$. For every directed edge $u \to v \in D$, there is a corresponding
node $y \in Y_v$ for which it holds, by Definition 4, that $y \leftrightarrow v \notin B$. Therefore, if there
are $d_v$ directed edges pointing to $v$, then there are $d_v$ nodes, namely, the ones in
$Y_v$, that are not adjacent to $v$ in the bidirected part $(V, B)$. If we consider another
node $w \in V$, with $d_w$ parents, then there are again $d_w$ non-adjacencies $\{w, u\}$,
$u \in Y_w$, in the bidirected part. Moreover, $\{v, w\}$ cannot appear as a non-adjacency
for both node $v$ and node $w$ because of the requirement that $v \in Y_w$ imply $w \notin Y_v$.
We conclude that there are at least $|D|$ non-edges in the bidirected part. In other
words, $|D| + |B| \leq \binom{m}{2}$. $\square$

We conclude the discussion of Theorems 1 and 2 by pointing out that HTC-identifiability
is equivalent to a seemingly weaker criterion.

Definition 5. A set of nodes $Y \subset V$ satisfies the weak half-trek criterion with
respect to node $v \in V$ if

(i) $|Y| = |P(v)|$,
(ii) $Y \cap (\{v\} \cup S(v)) = \emptyset$, and
(iii) there is a system of treks with no sided intersection from $Y$ to $P(v)$ such
      that for any $w \in Y \cap H(v)$, the trek originating at $w$ is a half-trek.

Lemma 1. Suppose the set $W \subset V$ satisfies the weak half-trek criterion with respect
to some node $v$. Then there exists a set $Y$ satisfying the half-trek criterion with
respect to $v$, such that $Y \cap H(v) = W \cap H(v)$.

Lemma 1 is proved in the appendix. It yields the following result, which is proved
in Section 8.

Theorem 3 (Weak HTC). Theorems 1 and 2 hold when using the weak half-trek
criterion instead of the half-trek criterion. Moreover, a graph $G$ can be proved
to be rationally identifiable (or generically infinite-to-one) using the weak half-trek
criterion if and only if $G$ is HTC-identifiable (or HTC-infinite-to-one).

4. G-criterion

The G-criterion, proposed in [BP06], is a sufficient criterion for rational identifiability
in acyclic mixed graphs. The criterion attempts to prove the fiber $\mathcal{F}(\Lambda, \Omega)$
to be equal to $\{(\Lambda, \Omega)\}$ by solving the equation system

    $\Sigma = (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$

in a stepwise manner. The steps yield the entries in $\Lambda$ column-by-column and, simultaneously,
more and more rows and columns for principal submatrices of $\Omega$. As
explained in Section 8, the new half-trek method we proposed in Section 3 starts
from an equation system that has $\Omega$ eliminated and then only proves the entries of
$\Lambda$ to be uniquely identified. In this section, we show that, due to this key simplification,
the sufficient condition in the half-trek method provides an improvement
over the G-criterion for acyclic mixed graphs.

To prepare for the comparison of the two criteria, we first restate the identifiability
theorem associated to the G-criterion in our own notation. Enumerate the
vertex set of an acyclic mixed graph $G$ according to any topological ordering as
$V = [m] = \{1, \dots, m\}$. (Then $v \to w$ only if $v < w$.) Use the ordering to uniquely
associate bidirected edges to individual nodes by defining, for each $v \in V$, the sets
of siblings $S_<(v) = \{w \in S(v) : w < v\}$ and $S_>(v) = \{w \in S(v) : w > v\}$. For a
trek $\pi$, we write $t(\pi)$ to denote the target node; that is, $\pi$ is a trek from some node
to $t(\pi)$.

Definition 6 ([BP06]). A set of nodes $A \subset V$ satisfies the G-criterion with respect
to a node $v \in V$ if $A \subset V \setminus \{v\}$ and $A$ can be partitioned into two (disjoint) sets
$Y, Z$ with $|Y| = |P(v)|$ and $|Z| = |S_<(v)|$, with two systems of treks $\Pi : Y \rightrightarrows P(v)$
and $\Psi : Z \rightrightarrows S_<(v)$, such that the following condition holds:

If each trek $\pi \in \Pi$ is extended to a path $\pi'$ by adding the edge $t(\pi) \to v$ to the
right-hand side, and each trek $\pi \in \Psi$ is similarly extended using $t(\pi) \leftrightarrow v$, then the
set of paths $\{\pi' : \pi \in \Pi \cup \Psi\}$ is a set of treks that has no sided intersection except
at the common target node $v$.

Figure 3. Rationally identifiable mixed graphs (panels (a) to (e)).



Note that the paths $\pi'$ for $\pi \in \Pi$ are always treks. For $\psi \in \Psi$, the requirement
that $\psi'$ is a trek means that $\psi$ cannot have an arrowhead at its target node.

For the statement of the main theorem about identifiability using the G-criterion,
define the depth of a node $v$ to be the length of the longest directed path terminating
at $v$. This number is denoted by Depth($v$).
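For an acyclic graph the depth can be computed recursively, since the longest directed path ending at $v$ extends a longest path ending at one of its parents. The sketch below is a minimal illustration with hypothetical names (`depths`, `depth`); it assumes that $D$ is a set of ordered pairs and that the directed part is acyclic, as otherwise the recursion would not terminate.

```python
from functools import lru_cache

def depths(nodes, D):
    """Map each node v to the length of the longest directed path ending at v."""
    parent_lists = {v: [x for (x, y) in D if y == v] for v in nodes}

    @lru_cache(maxsize=None)
    def depth(v):
        # A node without parents has depth 0.
        return max((depth(p) + 1 for p in parent_lists[v]), default=0)

    return {v: depth(v) for v in nodes}

# Directed part of the IV graph of Example 1: 1 -> 2 -> 3.
print(depths([1, 2, 3], {(1, 2), (2, 3)}))

# With the additional edge 1 -> 3, node 3 still has depth 2 (path 1 -> 2 -> 3).
print(depths([1, 2, 3], {(1, 2), (2, 3), (1, 3)}))
```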



Theorem 4 ([BP06]). Suppose $(A_v : v \in V)$ is a family of subsets of the vertex set
$V$ of an acyclic mixed graph $G$ and, for each $v$, the set $A_v$ satisfies the G-criterion
with respect to $v$. Then $G$ is rationally identifiable if at least one of the following
two conditions is satisfied:

(C1) For all $v$ and all $w \in A_v$, it holds that Depth($w$) < Depth($v$).
(C2) For all $v$ and all $w \in A_v \cap (H(v) \cup S_>(v))$, the trek associated to node $w$
     in the definition of the G-criterion is a half-trek. Furthermore, there is a
     total ordering $\prec$ on $V$, such that if $w \in A_v \cap (H(v) \cup S_>(v))$, then $w \prec v$.

We remark that the ordering $\prec$ in condition (C2) need not agree with any topological
ordering of the graph. When using only condition (C1) the theorem was
given in [BP02a], and the literature is not always clear on which version of the
G-criterion is concerned. For instance, all examples in [CK10] can be proven to be
rationally identifiable by means of Theorem 4 as stated here.

We now compare the G-criterion to the half-trek criterion. We say that a graph
$G$ is GC-identifiable if it satisfies the conditions of Theorem 4. The next theorem
and the proposition that follows are proved in Section 10. They demonstrate that
the half-trek method provides an improvement over the G-criterion even for acyclic
mixed graphs.

Theorem 5. A GC-identifiable acyclic mixed graph is also HTC-identifiable.

The graph in Figure 2 is HTC-identifiable, as was shown in Example 5.

Proposition 3. The acyclic mixed graph in Figure 2 is not GC-identifiable.






Figure 4. Generically infinite-to-one graphs (panels (a) to (d)).



5. Examples 

In the previous section, the acyclic mixed graph from Figure 2 was shown to
be HTC-identifiable but not GC-identifiable. In this section, we give several other
examples that illustrate the conditions of our theorems and the ground that lies
beyond them. The examples are selected from the computational experiments that
we report on in Section 6. We begin with the identifiable class.

Example 6. Figure 3 shows 5 rationally identifiable mixed graphs:

(a) This graph is simple and acyclic and, thus, HTC- and GC-identifiable; recall
    Proposition 1. There are pairs $(\Lambda, \Omega)$ for which the fiber $\mathcal{F}(\Lambda, \Omega)$ has
    positive dimension. By Theorem 2 in [DFS11], removing the edge $1 \leftrightarrow 3$
    would give a new graph with all fibers of the form $\mathcal{F}(\Lambda, \Omega) = \{(\Lambda, \Omega)\}$.
(b) The next graph is acyclic but not simple. It is HTC- and GC-identifiable.
(c) This acyclic graph is HTC-inconclusive. The bidirected part being connected,
    the example is not covered by the graph decomposition technique
    discussed in Section 7.
(d) This is an example of a cyclic graph that is HTC-identifiable.
(e) This cyclic graph is HTC-inconclusive.



On $m = 5$ nodes, graphs with more than $\binom{5}{2} = 10$ edges are trivially generically
infinite-to-one. The next example gives non-trivial non-identifiable graphs.

Example 7. All 4 graphs in Figure 4 are generically infinite-to-one. The acyclic
graph in (a) and the cyclic graph in (c) are HTC-infinite-to-one. The acyclic graph
in (b) and the cyclic graph in (d) are HTC-inconclusive.

Many HTC-inconclusive graphs have fibers that are of cardinality $2 \leq h < \infty$.
An example of an acyclic 4-node graph that is generically 2-to-one was given in
[Bri04]. Our next example lists more graphs of this generically finite-to-one type.

Example 8. Figure 5 shows four mixed graphs that are HTC-inconclusive and not
generically identifiable. All the graphs have fibers that are generically finite:

Figure 5. Generically finite-to-one graphs (panels (a) to (d)).



(a) This graph is generically 2-to-1. We note that the coefficients $\lambda_{v5}$, $v \in [4]$,
    can be identified; that is, any two matrices $\Lambda, \Lambda'$ appearing in the same fiber
    have identical fifth column.

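(b) Generically, the fibers of this graph have cardinality either one or three.
    For instance, let

        […], $\quad \omega_{11} = \dots = \omega_{55} = 1, \quad \omega_{12} = \omega_{13} = \omega_{15} = $ […].

    Define

        $f(\lambda_{12}) = 529\lambda_{12}^4 - 460\lambda_{12}^3 - 3642\lambda_{12}^2 - 2380\lambda_{12} - 4271.$

    Then, not considering the non-generic situation with $f(\lambda_{12}) = 0$, we have

        $|\mathcal{F}(\Lambda, \Omega)| = \begin{cases} 3 & \text{if } f(\lambda_{12}) > 0, \\ 1 & \text{if } f(\lambda_{12}) < 0. \end{cases}$

    The polynomial $f$ has two real roots which are approximately $-2.16$ and $3.44$.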

(c) As shown in [DFS11], a cycle of length 3 or more is generically 2-to-1.

(d) The next graph is not generically identifiable. Generically, its fibers have
    at least two elements but not more than 10. Using the terminology from
    Definition 7 below, the graph has degree of identifiability 10. We do not
    know of an example of a fiber with more than two elements.
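The approximate roots quoted in part (b) of Example 8 can be reproduced with a simple bisection on the quartic $f$. The sketch below is a minimal illustration; the bracketing intervals $[-3, -2]$ and $[3, 4]$, the tolerance, and the helper names `bisect_root` and `fiber_size` are assumptions made here, and `fiber_size` merely restates the case distinction given in the example.

```python
def f(x):
    """The quartic from Example 8(b), as a function of lambda_12."""
    return 529 * x**4 - 460 * x**3 - 3642 * x**2 - 2380 * x - 4271

def bisect_root(lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    assert f(lo) * f(hi) < 0, "f must change sign on [lo, hi]"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def fiber_size(l12):
    """Generic fiber cardinality according to the sign of f (Example 8(b))."""
    return 3 if f(l12) > 0 else 1

negative_root = bisect_root(-3.0, -2.0)
positive_root = bisect_root(3.0, 4.0)
print(round(negative_root, 2), round(positive_root, 2))
print(fiber_size(0.0), fiber_size(4.0))
```

Between the two roots $f$ is negative at the sampled points, so under the stated case distinction the fiber has a single element there, and three elements for values of $\lambda_{12}$ beyond either root.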



6. Computational experiments 

When the number m of nodes in the graph is small, then the identification problem
can be fully solved by means of algebraic techniques. In this section we report
on the results of an exhaustive study of all mixed graphs with $m \le 5$ nodes as well
as simulations for graphs with m = 6 and 7 nodes. In our exhaustive computations,
counts of graphs refer to unlabeled graphs, that is, we count isomorphism classes






Table 1. Classification of unlabeled mixed graphs with $3 \le m \le 5$
nodes; column 'HTC' gives counts of HTC-classifiable graphs.

                                          m = 3           m = 4               m = 5
Unlabeled mixed graphs                 Total   HTC     Total   HTC        Total        HTC

Acyclic, $\le \binom{m}{2}$ edges         22     -       715     -      103,670          -
    rationally identifiable               17    17       343   343       32,378     32,257
    generically finite-to-one              0     -         4     -        1,166          -
    generically $\infty$-to-one            5     5       368   368       70,126     70,099

Acyclic, $> \binom{m}{2}$ edges           18     -       852     -      152,520          -

Cyclic, $\le \binom{m}{2}$ edges           6     -       718     -      348,175          -
    rationally identifiable                2     2       239   230       91,040     78,586
    generically finite-to-one              1     -        75     -       44,703          -
    generically $\infty$-to-one            3     3       404   383      212,432    202,697

Cyclic, $> \binom{m}{2}$ edges            58     -     9,307     -    8,439,859          -

of graphs with respect to permutation of the vertex set $V = [m]$. A general
introduction to the algebraic techniques that underlie our computations can be found
in [CLO07]. The use of computer algebra for parameter identification problems is
explained in [GPSS10]. We give some more details in Appendix A. All algebraic
computations were done using the software Singular [DGPS11]; the combinatorial
criteria were implemented in R [R D11].

The results for $m \le 5$ are given in Table 1. This table distinguishes between
acyclic and cyclic (that is, non-acyclic) graphs. In each case, we single out the
graphs with more than $\binom{m}{2}$ edges. These are trivially generically infinite-to-one and
also HTC-infinite-to-one according to Proposition^ The remaining graphs are clas- 
sified into three disjoint groups, namely, rationally identifiable graphs, generically 
infinite-to-one graphs and generically finite-to-one graphs. The following notion 
makes the distinctions and terminology precise. Here, we let $\mathbb{C}^D_{\mathrm{reg}}$ be defined
as $\mathbb{R}^D_{\mathrm{reg}}$ but allowing for complex matrix entries. We write $\mathbb{C}^{m \times m}_{\mathrm{sym}}$ for the space of
symmetric $m \times m$ complex matrices.



Definition 7. Let $G = (V, D, B)$ be a mixed graph. Then the complex rational map
$\phi_{G,\mathbb{C}}$, obtained by extending the map $\phi_G$ to $\mathbb{C}^D_{\mathrm{reg}} \times \mathbb{C}^{m \times m}_{\mathrm{sym}}$, is generically $h$-to-one
with $h \in \mathbb{N} \cup \{\infty\}$; and we call $h = ID(G)$ the degree of identifiability of $G$.



A mixed graph $G$ is rationally identifiable if and only if its degree of identifiability
$ID(G) = 1$. Similarly, $G$ is generically infinite-to-one if and only if $ID(G) = \infty$; in
that case the fiber $\mathcal{F}(\Lambda,\Omega) \subset \mathbb{R}^D_{\mathrm{reg}} \times PD(B)$ defined in (1.4) is generically of positive
dimension. In Table 1, a graph $G$ is generically finite-to-one if $2 \le ID(G) < \infty$
and, thus, the fiber $\mathcal{F}(\Lambda,\Omega)$ is generically finite with $|\mathcal{F}(\Lambda,\Omega)| \le ID(G)$. If $ID(G)$
is finite and even, then $G$ cannot be generically identifiable because polynomial
equations have complex solutions appearing in conjugate pairs and $\mathcal{F}(\Lambda,\Omega)$ always
contains at least one (real) point, namely, the pair $(\Lambda,\Omega)$ itself. If $ID(G)$ is odd,
then we cannot exclude the possibility that the equation defining the fiber $\mathcal{F}(\Lambda,\Omega)$
Figure 6. Classification of labeled mixed graphs with m = 7
nodes. Each bar represents 5,000 randomly drawn graphs with
fixed number of edges, ranging from 1 to $\binom{m}{2}$. Legend: HTC-identifiable,
HTC-infinite-to-one, HTC-inconclusive.



generically only has one real point, leading to generic identifiability. However, we 
did not observe this in any examples we checked. 

Table 1 shows that our half-trek method yields a perfect classification of acyclic
graphs with $m \le 4$ nodes and cyclic graphs with $m \le 3$ nodes. Among the acyclic
graphs with m = 5 nodes and at most $\binom{5}{2} = 10$ edges, our method misses 121
rationally identifiable graphs and 27 generically infinite-to-one graphs. The gaps
are larger for cyclic graphs with at most 10 edges, but the method still classifies
86% of the rationally identifiable graphs correctly and misses less than 5% of the
generically infinite-to-one graphs. The degree of identifiability $ID(G)$ of an acyclic
graph $G$ with 5 nodes can be any number in [4]. For example, the graphs in
Figure 5(a) and (b) have $ID(G)$ equal to 2 and 3, respectively. For a cyclic graph
$G$ with 5 nodes, the degree can be any number in $[8] \cup \{10\}$; recall the example in
Figure 5(d).
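The counts quoted in this paragraph follow from the m = 5 entries of Table 1 by simple arithmetic. The sketch below redoes that arithmetic; the dictionary layout and the function name `summarize` are illustrative choices, while the numbers are copied from the table.

```python
# m = 5 rows of Table 1 for graphs with at most 10 edges.
acyclic_m5 = {"total": 103670, "rational": 32378, "finite": 1166, "infinite": 70126,
              "htc_rational": 32257, "htc_infinite": 70099}
cyclic_m5 = {"total": 348175, "rational": 91040, "finite": 44703, "infinite": 212432,
             "htc_rational": 78586, "htc_infinite": 202697}


def summarize(row):
    """Return the gaps between the algebraic classification and the HTC counts."""
    # The three classes are disjoint and exhaust the graphs in the row.
    assert row["rational"] + row["finite"] + row["infinite"] == row["total"]
    return {
        "missed_rational": row["rational"] - row["htc_rational"],
        "missed_infinite": row["infinite"] - row["htc_infinite"],
        "share_rational_found": row["htc_rational"] / row["rational"],
        "share_infinite_missed": 1 - row["htc_infinite"] / row["infinite"],
    }


acyclic_summary = summarize(acyclic_m5)
cyclic_summary = summarize(cyclic_m5)
print(acyclic_summary["missed_rational"], acyclic_summary["missed_infinite"])
print(round(100 * cyclic_summary["share_rational_found"]),
      round(100 * cyclic_summary["share_infinite_missed"], 1))
```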

In our computations we tracked which acyclic graphs are rationally identifiable
according to the G-criterion as in Theorem 4. Since the method depends on the
choice of a topological ordering of the nodes, we tested each possible topological
ordering of the nodes. Our computation shows that the G-criterion finds all
rationally identifiable acyclic graphs with $m \le 4$ nodes. For m = 5, the G-criterion
proves 31,830 acyclic graphs to be rationally identifiable, that is, it misses 427 of
the HTC-identifiable acyclic graphs.

Exhaustive computations become prohibitive for more than 5 nodes. Instead
we randomly generated mixed graphs, with m = 6 or m = 7 nodes, and tested
whether they are HTC-identifiable, HTC-infinite-to-one or HTC-inconclusive. More
precisely, for each value $n = 1, 2, \dots, \binom{m}{2}$, we randomly sampled 5,000 labeled
mixed graphs on m nodes with n edges, by selecting a subset of size n from the
set of all possible edges, which consists of $2\binom{m}{2}$ directed edges and $\binom{m}{2}$ bidirected
edges. The results of these simulations for m = 7 are shown in Figure 6. (The
results for m = 6 were very similar and are not shown.) As can be expected,
the proportion of graphs that are acyclic decreases as m increases. Among both
acyclic and cyclic graphs with at most $\binom{m}{2}$ edges, the proportion of graphs that are
generically infinite-to-one increases as m increases. For each value of m, the vast
majority of graphs that are rationally identifiable or generically infinite-to-one are
HTC-classifiable. Most but not all of the HTC-identifiable acyclic graphs are also
GC-identifiable; the difference is too small to be visible in the figure.

Figure 7. An acyclic mixed graph shown in (a) and its two mixed
components shown in (b) and (c).
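The sampling scheme for the simulations can be sketched as follows. This is only an illustration of the description in the text, not the R code used for the paper: the function names, the edge encoding, and the use of Kahn's algorithm for the acyclicity check are assumptions.

```python
import random
from itertools import combinations, permutations


def sample_mixed_graph(m, n, rng):
    """Draw n distinct edges uniformly from all possible edges on nodes 1..m."""
    nodes = range(1, m + 1)
    directed = [("directed", v, w) for v, w in permutations(nodes, 2)]      # 2 * C(m, 2)
    bidirected = [("bidirected", v, w) for v, w in combinations(nodes, 2)]  # C(m, 2)
    chosen = rng.sample(directed + bidirected, n)
    D = {(v, w) for kind, v, w in chosen if kind == "directed"}
    B = {(v, w) for kind, v, w in chosen if kind == "bidirected"}
    return D, B


def is_acyclic(m, D):
    """Kahn's algorithm: the directed part is acyclic iff every node can be removed."""
    indegree = {v: 0 for v in range(1, m + 1)}
    for _, w in D:
        indegree[w] += 1
    ready = [v for v in indegree if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for a, b in D:
            if a == v:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == m


rng = random.Random(0)  # fixed seed, chosen here only for reproducibility
D, B = sample_mixed_graph(7, 10, rng)
print(len(D), len(B), is_acyclic(7, D))
```

Repeating the draw 5,000 times for each edge count n and classifying every graph would produce the bars of a plot like Figure 6.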

7. Decomposition of acyclic graphs 

In this section we discuss how, for acyclic graphs, the scope of applicability
of our half-trek method can be extended by using a graph decomposition due to
[Tia05]. Let $G = (V, D, B)$ be an acyclic mixed graph, and let $C_1, \dots, C_k \subset V$ be
the (pairwise disjoint) vertex sets of the connected components of the bidirected
part $(V, B)$. For $j \in [k]$, let $B_j = B \cap (C_j \times C_j)$ be the bidirected edges in the $j$th
connected component. Define $V_j$ to be the union of $C_j$ and any parents of nodes in
$C_j$, that is,

$V_j = C_j \cup \bigcup_{v \in C_j} P(v), \qquad j = 1, \dots, k.$

Clearly, the sets $V_1, \dots, V_k$ need not be pairwise disjoint. Let $D_j$ be the set of edges
$v \to w$ in the directed part $(V, D)$ that have $v \in V_j$ and $w \in C_j$. The decomposition
of [Tia05] involves the graphs $G_j = (V_j, D_j, B_j)$, for $j \in [k]$. We refer to these as
the mixed components $G_1, \dots, G_k$ of $G$. Figure 7 gives an example.
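The construction of the mixed components translates directly into code. The following sketch assumes directed edges are given as pairs (v, w) for v → w and bidirected edges as pairs (v, w) for v ↔ w; the function name and the small example graph (1 → 2 → 3 with 2 ↔ 3) are illustrative and not taken from the paper.

```python
def mixed_components(V, D, B):
    """Return the list of mixed components (V_j, D_j, B_j) of the mixed graph (V, D, B)."""
    # Connected components C_1, ..., C_k of the bidirected part (V, B).
    neighbours = {v: set() for v in V}
    for v, w in B:
        neighbours[v].add(w)
        neighbours[w].add(v)
    seen, components = set(), []
    for v in V:
        if v in seen:
            continue
        component, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in component:
                component.add(u)
                stack.extend(neighbours[u] - component)
        seen |= component
        components.append(component)

    result = []
    for C in components:
        D_j = {(v, w) for v, w in D if w in C}                # directed edges pointing into C_j
        V_j = C | {v for v, _ in D_j}                         # C_j together with all parents
        B_j = {(v, w) for v, w in B if v in C and w in C}     # bidirected edges inside C_j
        result.append((V_j, D_j, B_j))
    return result


# Hypothetical example: 1 -> 2 -> 3 with 2 <-> 3.
components = mixed_components([1, 2, 3], {(1, 2), (2, 3)}, {(2, 3)})
for V_j, D_j, B_j in components:
    print(sorted(V_j), sorted(D_j), sorted(B_j))
```

Every directed edge lands in exactly one $D_j$ (the component containing its head), which is the edge partition used in the remainder of the section.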

The mixed components $G_1, \dots, G_k$ create a partition of the edges of $G$. There is
an associated partition of the entries of $\Lambda \in \mathbb{R}^D$ that yields submatrices $\Lambda_1, \dots, \Lambda_k$
with each $\Lambda_j \in \mathbb{R}^{D_j}$; recall that for an acyclic graph $\mathbb{R}^D_{\mathrm{reg}} = \mathbb{R}^D$. Similarly, from
$\Omega \in PD(B)$, we create matrices $\Omega_1, \dots, \Omega_k$ with each $\Omega_j \in PD(B_j)$, where $PD(B_j)$
is defined with respect to the graph $G_j$, that is, the set contains matrices indexed
by $V_j \times V_j$. We define $\Omega_j$ by taking the submatrix $\Omega_{C_j,C_j}$ from $\Omega$ and extending it
by setting $(\Omega_j)_{vv} = 1$ for all $v \in V_j \setminus C_j$. The work leading up to Theorems 1 and
2 in [Tia05] shows that, for all $j \in [k]$, there is a rational map $f_j$ defined on the
entire cone of $m \times m$ positive definite matrices such that

$f_j \circ \phi_G(\Lambda,\Omega) = \phi_{G_j}(\Lambda_j,\Omega_j)$

for all $\Lambda \in \mathbb{R}^D$ and $\Omega \in PD(B)$. In turn, there is a rational map $g$ defined
everywhere on the product of the relevant cones of positive definite matrices such
that

$g(\phi_{G_1}(\Lambda_1,\Omega_1), \dots, \phi_{G_k}(\Lambda_k,\Omega_k)) = \phi_G(\Lambda,\Omega)$

for all $\Lambda \in \mathbb{R}^D$ and $\Omega \in PD(B)$. We thus obtain the following theorem.

Theorem 6. For an acyclic mixed graph $G$ with mixed components $G_1, \dots, G_k$,
the following holds:

(i) $G$ is rationally (or generically) identifiable if and only if all components
$G_1, \dots, G_k$ are rationally (or generically) identifiable;
(ii) $G$ is generically infinite-to-one if and only if there exists a component $G_j$
that is generically infinite-to-one;
(iii) if each $G_j$ is generically $h_j$-to-one with $h_j < \infty$ then $G$ is generically $h$-to-one
with $h = \prod_{j=1}^{k} h_j$.



We remark that this theorem could also be stated as $ID(G) = \prod_{j=1}^{k} ID(G_j)$, in
terms of the degree of identifiability from Definition 7.
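As a small illustration of the product formula, the degree of identifiability of an acyclic graph can be assembled from the degrees of its mixed components. The function name below is hypothetical, and `math.inf` is used here as a stand-in for the value $\infty$.

```python
import math


def degree_of_identifiability(component_degrees):
    """Combine the degrees ID(G_j) of the mixed components into ID(G)."""
    degree = 1
    for h in component_degrees:
        if h == math.inf:
            return math.inf   # one infinite-to-one component makes G infinite-to-one
        degree *= h
    return degree


print(degree_of_identifiability([1, 1]))         # all components rationally identifiable
print(degree_of_identifiability([2, 3]))         # finite-to-one components multiply
print(degree_of_identifiability([2, math.inf]))  # infinite-to-one component dominates
```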

The next theorem makes the observation that when applying our half-trek method
to an acyclic graph, we may always first decompose the graph into its mixed
components, which may result in computational savings.

Theorem 7. If an acyclic mixed graph $G$ is HTC-identifiable then all its mixed
components $G_1, \dots, G_k$ are HTC-identifiable. Furthermore, $G$ is HTC-infinite-to-one
if and only if there exists a mixed component $G_j$ that is HTC-infinite-to-one.

Proof. The claim about HTC-identifiability follows from Lemma [JJ in Section 11 



The second statement is a consequence of Lemmas 8 and 9, also from Section 11. □

The benefit of the graph decomposition goes beyond computation in that it is
possible that identification methods apply to all mixed components but not the
original graph. In [Tia05], this is exemplified for the G-criterion. More precisely,
the 4-node example given there concerns the early version of the G-criterion from
[BP02a] that includes only condition (C1) from Theorem 4 but not condition (C2),
which is due to [BP06]. However, graph decomposition allows one to also extend the
scope of our more general half-trek method, where passing to mixed components
can avoid problems with finding a suitable total ordering of the vertex set.
Surprisingly, however, the extension is possible only for the sufficient condition, that is,
HTC-identifiability; Theorem 7 gives an equivalence result for HTC-infinite-to-one
graphs.

Proposition 4. The acyclic mixed graph in Figure 7(a) is not HTC-identifiable
but both its mixed components are HTC-identifiable.

Proof. Suppose for a contradiction that the original graph $G$ is HTC-identifiable
and that the sets $Y_3$, $Y_4$ and $Y_5$ are part of the family of sets appearing in Theorem 1.
In particular, each set has two elements and satisfies the half-trek criterion with
respect to its subscript. Now, the presence of the edge $2 \leftrightarrow 3$ implies that $Y_3 \subset
\{1,4,5\}$. Moreover, $Y_3 \ne \{1,4\}$ because the sole half-trek from 4 to 3 has 1 in its
right-hand side and all half-treks from 1 to 3 are directed paths and thus have the
source 1 on their right-hand side as well. It follows that $5 \in Y_3$ and, thus, $3 \notin Y_5$.
Since $2 \leftrightarrow 5$ is in $G$, it must hold that $Y_5 = \{1,4\}$. Examining the descendant sets
$H(v)$ we see that the total ordering $\prec$ in Theorem 1 ought to satisfy $4 \prec 5 \prec 3$.
Since $1 \in S(4)$ and $3, 5 \in H(4)$, we conclude that $Y_4 \subset \{2\}$, which is a contradiction
because $Y_4$ must have two elements.
Turning to the mixed components of $G$, it is clear that the component shown in
Figure 7(c) is HTC-identifiable because it is a simple graph; recall Proposition 1.
The component in Figure 7(b) is HTC-identifiable because Theorem 1 applies with
the choice of

$Y_1 = Y_4 = \emptyset, \quad Y_2 = \{1\}, \quad Y_5 = \{1,4\}, \quad Y_3 = \{1,5\},$

and any ordering that respects $5 \prec 3$. □

As seen in Table 1, the half-trek method misses 121 rationally identifiable acyclic
graphs with 5 nodes; among them is the example from Proposition 4. After graph
decomposition, the half-trek method proves 9 of the 121 examples to be rationally
identifiable. The remaining 112 graphs all have a connected bidirected part;
see Figure 3(c) for an example. On 5 nodes, there are 27 generically infinite-to-one
graphs that are HTC-inconclusive. All of these have a connected bidirected
part. (For larger graphs, we expect that there will be some graphs that are not
bidirected-connected, where the half-trek method combined with decomposition
will not apply.)

8. Proofs for the half-trek criterion 

In this section we prove the two main theorems stated in Section 3. We begin
with the identifiability theorem.

Theorem 1 (HTC-identifiability). Let $(Y_v : v \in V)$ be a family of subsets of the
vertex set $V$ of a mixed graph $G$. If, for each node $v$, the set $Y_v$ satisfies the half-trek
criterion with respect to $v$, and there is a total ordering $\prec$ on the vertex set $V$ such
that $w \prec v$ whenever $w \in Y_v \cap H(v)$, then $G$ is rationally identifiable.

Proof of Theorem 1. Let $\Sigma = \phi_G(\Lambda_0,\Omega_0)$ be a matrix in the image of $\phi_G$, given by
a generically chosen pair $(\Lambda_0,\Omega_0) \in \Theta = \mathbb{R}^D_{\mathrm{reg}} \times PD(B)$. For generic identifiability,
we need to show that the equation

(8.1) $\Sigma = (I-\Lambda)^{-T}\,\Omega\,(I-\Lambda)^{-1}$

has a unique solution in $\Theta$, namely, $(\Lambda,\Omega) = (\Lambda_0,\Omega_0)$. However, a pair $(\Lambda,\Omega)$ solves
(8.1) if and only if

(8.2) $[(I-\Lambda)^T \Sigma (I-\Lambda)]_{vw} = 0 \quad \forall (v,w) \notin B$ and $v \ne w$,

and

(8.3) $[(I-\Lambda)^T \Sigma (I-\Lambda)]_{vw} = \omega_{vw} \quad \forall (v,w) \in B$ or $v = w$.

The non-zero entries of $\Omega$ appearing in (8.3) are freely varying real numbers that
are subject only to the requirement that $\Omega$ be positive definite. For cyclic graphs,
(8.1) contains rational equations. Hence, the focus is on (8.2), which defines a
polynomial equation system even when the graph is cyclic.



We prove the theorem by solving the equations (8.2) in stepwise manner according
to the ordering $\prec$. When visiting node $v$, the goal is to recover the $v$th column
of $\Lambda$ as a function of $\Sigma$. Based on solving linear equation systems, the functions of
$\Sigma$ that give the entries of $\Lambda$ will always be rational functions, proving our stronger
claim of rational (as opposed to mere generic) identifiability.
For our proof we proceed by induction and assume that, for all $w \prec v$, we have
recovered the entries of the vector $\Lambda_{P(w),w}$ as (rational) expressions in $\Sigma$. To solve
for $\Lambda_{P(v),v}$, let $Y_v = \{y_1, \dots, y_n\}$ and $P(v) = \{p_1, \dots, p_n\}$. Define $A \in \mathbb{R}^{n \times n}$ as

$A_{ij} = \begin{cases} [(I-\Lambda)^T \Sigma]_{y_i p_j} & \text{if } y_i \in H(v), \\ \Sigma_{y_i p_j} & \text{if } y_i \notin H(v). \end{cases}$

Define $b \in \mathbb{R}^n$ as

$b_i = \begin{cases} [(I-\Lambda)^T \Sigma]_{y_i v} & \text{if } y_i \in H(v), \\ \Sigma_{y_i v} & \text{if } y_i \notin H(v). \end{cases}$

Note that both $A$ and $b$ depend only on $\Sigma$ and the columns $\Lambda_{P(w),w}$ with $w \in
Y_v \cap H(v)$, which are assumed already to be known as a function of $\Sigma$ because
$w \in Y_v \cap H(v)$ implies $w \prec v$. We now claim that the vector $\Lambda_{P(v),v}$ solves the
equation system $A \cdot \Lambda_{P(v),v} = b$.

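First, consider an index $i$ with $y_i \in Y_v \cap H(v)$. Since $Y_v$ satisfies the half-trek
criterion with respect to $v$, the node $y_i \ne v$ is not a sibling of $v$. Therefore, by
(8.2),

$[(I-\Lambda)^T \Sigma (I-\Lambda)]_{y_i v} = 0 \implies [(I-\Lambda)^T \Sigma \Lambda]_{y_i v} = [(I-\Lambda)^T \Sigma]_{y_i v}.$

It follows that

$(A \cdot \Lambda_{P(v),v})_i = \sum_{j=1}^{n} [(I-\Lambda)^T \Sigma]_{y_i p_j} \lambda_{p_j v} = [(I-\Lambda)^T \Sigma \Lambda]_{y_i v} = [(I-\Lambda)^T \Sigma]_{y_i v} = b_i.$

Second, let $i$ be an index with $y_i \in Y_v \setminus H(v)$. Then

$(A \cdot \Lambda_{P(v),v})_i = \sum_{j=1}^{n} \sigma_{y_i p_j} \lambda_{p_j v} = [\Sigma \Lambda]_{y_i v} = [(I-\Lambda)^{-T} \Omega (I-\Lambda)^{-1} \Lambda]_{y_i v}.$

By definition of $H(v)$, we know that $[(I-\Lambda)^{-T} \Omega]_{y_i v} = 0$. Adding this zero and
using that $(I-\Lambda)^{-1} = I + (I-\Lambda)^{-1}\Lambda$, we obtain that

$(A \cdot \Lambda_{P(v),v})_i = [(I-\Lambda)^{-T} \Omega (I-\Lambda)^{-1} \Lambda]_{y_i v} + [(I-\Lambda)^{-T} \Omega]_{y_i v} = [(I-\Lambda)^{-T} \Omega (I-\Lambda)^{-1}]_{y_i v} = \sigma_{y_i v} = b_i.$

Therefore, $A \cdot \Lambda_{P(v),v} = b$, as claimed.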

By Lemma 2 below, the matrix $A$ is invertible in the generic situation. Therefore,
we have shown that $\Lambda_{P(v),v} = A^{-1} b$ is a rational function of $\Sigma$. Proceeding
inductively according to the vertex ordering $\prec$, we recover $\Lambda_{P(v),v}$ for all $v$ and,
thus, the entire matrix $\Lambda$, as desired. □
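The stepwise recovery in this proof can be traced on a small example. The sketch below uses exact rational arithmetic and a hypothetical graph 0 → 1 → 2 with 0 ↔ 2 (nodes renumbered from 0), with $Y_1 = \{0\}$, $Y_2 = \{1\}$ and the ordering 0 ≺ 1 ≺ 2. Node 1 uses the branch $y_i \notin H(v)$, node 2 the branch $y_i \in H(v)$. All function names and parameter values are illustrative assumptions. The set returned by `half_trek_reachable` includes $v$ and its siblings, which is harmless here because $Y_v$ never contains them.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square and invertible)."""
    n = len(A)
    M = [list(map(Fraction, A[i])) + [Fraction(b[i])] for i in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def i_minus(Lam):
    n = len(Lam)
    return [[int(i == j) - Lam[i][j] for j in range(n)] for i in range(n)]


def covariance(Lam, Om):
    """Sigma = (I - Lam)^{-T} Om (I - Lam)^{-1}, as in (8.1)."""
    n = len(Lam)
    columns = [solve(i_minus(Lam), [int(i == j) for i in range(n)]) for j in range(n)]
    inv = transpose(columns)
    return matmul(matmul(transpose(inv), Om), inv)


def half_trek_reachable(v, D, B):
    """Nodes reachable from v or a sibling of v by a (possibly empty) directed path."""
    reach = {v} | {w for a, b in B for w in (a, b) if v in (a, b)}
    stack = list(reach)
    while stack:
        u = stack.pop()
        for a, b in D:
            if a == u and b not in reach:
                reach.add(b)
                stack.append(b)
    return reach


def recover_lambda(Sigma, D, B, Y, order):
    """Visit nodes along the ordering and solve A * Lam[P(v), v] = b at each node."""
    m = len(Sigma)
    Lam = [[Fraction(0)] * m for _ in range(m)]
    for v in order:
        P = sorted(a for a, b in D if b == v)
        if not P:
            continue
        H = half_trek_reachable(v, D, B)
        M = matmul(transpose(i_minus(Lam)), Sigma)   # uses only columns recovered so far
        ys = sorted(Y[v])
        A = [[(M if y in H else Sigma)[y][p] for p in P] for y in ys]
        b = [(M if y in H else Sigma)[y][v] for y in ys]
        for p, value in zip(P, solve(A, b)):
            Lam[p][v] = value
    return Lam


D, B = {(0, 1), (1, 2)}, {(0, 2)}
Lam_true = [[0, 2, 0], [0, 0, 3], [0, 0, 0]]
Om_true = [[1, 0, Fraction(1, 2)], [0, 1, 0], [Fraction(1, 2), 0, 2]]
Sigma = covariance(Lam_true, Om_true)
Lam_hat = recover_lambda(Sigma, D, B, {0: set(), 1: {0}, 2: {1}}, [0, 1, 2])
print(Lam_hat == Lam_true)
```

For node 2 the system reduces to the single equation $(\sigma_{11} - \lambda_{01}\sigma_{01})\,\lambda_{12} = \sigma_{12} - \lambda_{01}\sigma_{02}$, which shows how the previously recovered coefficient $\lambda_{01}$ enters.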

Lemma 2. Let $v \in V$ be any node. Let $Y \subset V \setminus (\{v\} \cup S(v))$, with $|Y| = |P(v)| = n$.
Write $Y = \{y_1, \dots, y_n\}$ and $P(v) = \{p_1, \dots, p_n\}$, and define the matrix $A$ as

$A_{ij} = \begin{cases} [(I-\Lambda)^T \Sigma]_{y_i p_j} & \text{if } y_i \in H(v), \\ \Sigma_{y_i p_j} & \text{if } y_i \notin H(v). \end{cases}$

If $Y$ satisfies the half-trek criterion with respect to $v$, then $A$ is generically invertible.
Proof. Recall the trek-rule from (2.3). Let $\mathcal{H}(v,w) \subset \mathcal{T}(v,w)$ be the set of all
half-treks from $v$ to $w$. Then, for each $i, j \in \{1, \dots, n\}$,

$A_{ij} = \begin{cases} \sum_{\pi \in \mathcal{H}(y_i,p_j)} \pi(\lambda,\omega) & \text{if } y_i \in H(v), \\ \sum_{\pi \in \mathcal{T}(y_i,p_j)} \pi(\lambda,\omega) & \text{if } y_i \notin H(v). \end{cases}$

For a system of treks $\Pi$, define the monomial

$\Pi(\lambda,\omega) = \prod_{\pi \in \Pi} \pi(\lambda,\omega).$

Then

$\det(A) = \sum_{\Psi} (-1)^{|\Psi|}\, \Psi(\lambda,\omega),$

where the sum is over systems of treks $\Psi$ from $Y$ to $P$ for which all treks $\psi \in \Psi$ with sources
in $H(v)$ are half-treks. (The sign $(-1)^{|\Psi|}$ is the sign of the permutation that writes
$p_1, \dots, p_n$ in the order of their appearance as targets of the treks in $\Psi$.)

By assumption, there exists some system of half-treks with no sided intersection
from $Y$ to $P$. Let $\Pi$ be such a system, with minimal total length among all such
systems. Now take any system of treks $\Psi$ from $Y$ to $P$, such that $\Psi(\lambda,\omega) = \Pi(\lambda,\omega)$.
(We do not assume that $\Psi$ has no sided intersection, or has any half-treks.) In
Lemma 3 immediately below, we prove that $\Psi = \Pi$ for any such $\Psi$. Therefore, the
coefficient of the monomial $\Pi(\lambda,\omega)$ in $\det(A)$ is given by $(-1)^{|\Pi|}$, and $\det(A)$ is
not the zero polynomial/power series. For generic choices of $(\Lambda,\Omega)$ it thus holds
that $\det(A) \ne 0$. □

Lemma 3. Suppose $Y, P \subset V$ are subsets of equal cardinality, and $\Pi : Y \rightrightarrows P$ is a
system of half-treks with no sided intersection, with minimal total length among all
such systems. If for a system of treks $\Psi : Y \rightrightarrows P$ the monomial $\Psi(\lambda,\omega) = \Pi(\lambda,\omega)$,
then $\Psi = \Pi$.

Proof. Let $Y = \{y_1, \dots, y_n\}$, $P = \{p_1, \dots, p_n\}$, and $\Pi = \{\pi_1, \dots, \pi_n\}$, where $\pi_i$
has source $y_i$ and target $p_i$. Since $\Pi$ has minimal total length among all systems of
half-treks from $Y$ to $P$ with no sided intersection, $\Pi$ cannot have a sub-system of
the form

$\pi_{i_1} : y_{i_1} \cdots y_{i_2} \cdots p_{i_1},$
$\pi_{i_2} : y_{i_2} \cdots y_{i_3} \cdots p_{i_2},$
$\vdots$
$\pi_{i_{r-1}} : y_{i_{r-1}} \cdots y_{i_r} \cdots p_{i_{r-1}},$
$\pi_{i_r} : y_{i_r} \cdots y_{i_1} \cdots p_{i_r}.$

If there were such a sub-system, each trek in the sub-system could be shortened, that
is, replace $\pi_{i_1} : y_{i_1} \cdots y_{i_2} \cdots p_{i_1}$ with its second section, $y_{i_2} \cdots p_{i_1}$, etc. Therefore,
we can relabel the elements of $Y$, $P$ and $\Pi$ such that $j \le i$ if trek $\pi_i$ contains $y_j$.

Write the second system of treks as $\Psi = \{\psi_1, \dots, \psi_n\}$, where $\psi_i$ has source $y_i$
and target $p_{\alpha(i)}$. Here, $\alpha$ is some permutation of the indices in $[n]$. We claim
that $\alpha(n) = n$ and $\psi_n = \pi_n$. Assuming this is true, let $Y' = \{y_1, \dots, y_{n-1}\}$ and
$P' = \{p_1, \dots, p_{n-1}\}$, and let $\Pi'$ and $\Psi'$ be the induced sub-systems of treks from
$Y'$ to $P'$. The ordering on $Y'$ follows the same rule as the ordering on $Y$. Then
$\psi_n = \pi_n$ implies that $\Pi'(\lambda,\omega) = \Pi(\lambda,\omega)/\pi_n(\lambda,\omega) = \Psi(\lambda,\omega)/\psi_n(\lambda,\omega) = \Psi'(\lambda,\omega)$.
By induction on $n$, we conclude that $\Psi = \Pi$.
It remains to show that $\alpha(n) = n$ and $\psi_n = \pi_n$. Write

(8.4) $\pi_n : y_n \circ\!\!\to z^n_1 \to z^n_2 \to \cdots \to z^n_k = p_n,$

where $y_n \circ\!\!\to z^n_1$ represents that either $y_n \to z^n_1$ or $y_n \leftrightarrow z^n_1$. By definition of the
ordering on $Y$, the node $y_n$ does not appear in any trek in $\Pi$, except for $\pi_n$. And,
node $y_n$ appears only once in $\pi_n$, since $\Pi$ has minimal total length. Hence, the only
edge in $\Pi$ containing $y_n$ is the edge $y_n \circ\!\!\to z^n_1$. Since $\Psi(\lambda,\omega) = \Pi(\lambda,\omega)$, this implies
that the only edge in $\Psi$ containing $y_n$ is the same edge $y_n \circ\!\!\to z^n_1$. Therefore, $\psi_n$
must be of the form

$\psi_n : y_n \circ\!\!\to z^n_1 \cdots .$

Case 1: The path $\psi_n$ consists of only the edge $y_n \circ\!\!\to z^n_1$. Then $z^n_1 \in P$. If
$z^n_1 = p_j$ for $j < n$, then $\Pi$ would have a sided intersection, which is a contradiction.
Therefore, $z^n_1 = p_n$. Since $\Pi$ is a system of minimal length, $\pi_n$ must also consist of
only $y_n \circ\!\!\to z^n_1 = p_n$, which shows that $\psi_n = \pi_n$.

Case 2: The path $\psi_n$ is of the form

$\psi_n : y_n \circ\!\!\to z^n_1 \to \cdots .$

Since $\Pi$ has no sided intersection, there is no edge of the form $p_n \to \cdot$ in $\Pi$. Since
$\Psi(\lambda,\omega) = \Pi(\lambda,\omega)$, we obtain that $z^n_1 \ne p_n$, and thus $k \ge 2$ in (8.4). Now observe
that the only edge of the form $z^n_1 \to \cdot$ in $\Pi$ is the edge $z^n_1 \to z^n_2$; otherwise, two
treks in $\Pi$ would have a sided intersection at $z^n_1$. It follows that

$\psi_n : y_n \circ\!\!\to z^n_1 \to z^n_2 \cdots .$

Continue now to add edges one at a time to the path $\psi_n$, applying the reasoning
just used at all but the last edge of $\psi_n$. Reasoning as in Case 1 for the last edge of
$\psi_n$, we find that $\psi_n$ and $\pi_n$ are both equal to

$y_n \circ\!\!\to z^n_1 \to z^n_2 \to \cdots \to z^n_k = p_n.$

This completes the proof that $\alpha(n) = n$ and $\psi_n = \pi_n$. □

We now turn to the proof of the non-identifiability theorem.

Theorem 2 (HTC-non-identifiability). Suppose $G$ is a mixed graph in which every
family $(Y_v : v \in V)$ of subsets of the vertex set $V$ either contains a set $Y_v$ that
fails to satisfy the half-trek criterion with respect to $v$ or contains a pair of sets
$(Y_v, Y_w)$ with $v \in Y_w$ and $w \in Y_v$. Then the parametrization $\phi_G$ is generically
infinite-to-one.

Proof of Theorem 2. Let

$N = \{\{v,w\} : v \ne w,\ (v,w) \notin B\}$

be the set of (unordered) 'nonsibling pairs' in the graph. Treating $\Sigma$ as fixed, let
$J \in \mathbb{R}^{|N| \times |D|}$ be the Jacobian of the equations in (8.2), taking partial derivatives
with respect to the non-zero entries of $\Lambda$. The entries of $J$ are given by

(8.5) $J_{\{v,w\},(u,v)} = -[(I-\Lambda)^T \Sigma]_{wu}, \qquad \{v,w\} \in N,\ u \in P(v),$

and all other entries zero. By Lemma 4 below, it is sufficient to show that, under
the conditions of the theorem, $J$ does not have full column rank.

In the remainder of this proof, we always let $\Sigma = \phi_G(\Lambda,\Omega)$ when considering
$J$. If $J$ has generically full column rank, then we can choose a set $M \subset N$ with
$|M| = |D| = \sum_{v \in V} |P(v)|$, such that $\det(J_{M,D})$ is not the zero polynomial, where
$J_{M,D}$ is the square submatrix formed by taking all rows of $J$ that are indexed by
$M$. By the definition of the determinant, there must be a partition $M = \bigcup_v M_v$
such that for all $v$, we have

$\det\left(J_{M_v,(P(v),v)}\right) \not\equiv 0.$

By (8.5), each entry $\{w_1,w_2\} \in M_v$ must have either $w_1 = v$ or $w_2 = v$. Writing
$Y_v = \{w : \{v,w\} \in M_v\}$, it holds that

$\det\left([(I-\Lambda)^T \Sigma]_{Y_v,P(v)}\right) = \pm\det\left(J_{\{Y_v,v\},(P(v),v)}\right) = \pm\det\left(J_{M_v,(P(v),v)}\right)$

is non-zero. By Lemma 5 below, this implies that each set $Y_v$ satisfies the half-trek
criterion with respect to its indexing node $v$. Forming a partition of $M \subset N$, the
sets $M_v$ are pairwise disjoint. Hence, no two nodes $v, w$ can satisfy both $v \in Y_w$
and $w \in Y_v$ because otherwise $\{v,w\} \in M_v \cap M_w$. □



Lemma 4. Define $J$ as in (8.5). If $J$ does not have full column rank, then the
parametrization $\phi_G$ is generically infinite-to-one.

Proof. The parametrization $\phi_G$ maps the $(|D| + |B| + m)$-dimensional set $\Theta =
\mathbb{R}^D_{\mathrm{reg}} \times PD(B)$ to the $\binom{m+1}{2}$-dimensional space of symmetric $m \times m$ matrices. Since
$\phi_G$ is a rational map, its Jacobian matrix $J(\phi_G)$ achieves its maximal rank at
generic points in $\Theta$. This maximal rank is the dimension of the image of $\phi_G$. If the
dimension is smaller than $|D| + |B| + m$, then, for generic choices of $(\Lambda,\Omega) \in \Theta$, the
fiber $\mathcal{F}(\Lambda,\Omega)$ has positive dimension and is, in particular, infinite. Therefore, the
lemma is proven if we can show that, under the assumed conditions, the Jacobian
of $\phi_G$ does not have full column rank.
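The dimension count in this paragraph also explains why graphs with more than $\binom{m}{2}$ edges are trivially generically infinite-to-one: the parameter space then has larger dimension than the space of symmetric matrices. A minimal sketch of this count follows; the function name is an illustrative choice.

```python
from math import comb


def exceeds_target_dimension(m, num_directed, num_bidirected):
    """True if |D| + |B| + m exceeds the dimension C(m + 1, 2) of symmetric m x m matrices."""
    return num_directed + num_bidirected + m > comb(m + 1, 2)


# Since C(m + 1, 2) - m = C(m, 2), the test is equivalent to |D| + |B| > C(m, 2).
print(exceeds_target_dimension(5, 6, 5))   # 11 edges on 5 nodes
print(exceeds_target_dimension(5, 6, 4))   # 10 edges on 5 nodes
```

When the function returns False the count alone is inconclusive, and the rank of the Jacobian has to be examined as in the remainder of the proof.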

We now claim that the Jacobian of $\phi_G$, $J(\phi_G)$, is of full column rank at $(\Lambda,\Omega)$
if and only if $J$ has full column rank at $\Lambda$ when taking $\Sigma = \phi_G(\Lambda,\Omega)$.

Consider the two maps

(8.6) $h : (\Lambda,\Omega) \mapsto (\Lambda, \phi_G(\Lambda,\Omega))$ and $g : (\Lambda,\Sigma) \mapsto (I-\Lambda)^T \Sigma (I-\Lambda),$

where the domain of $h$ is $\Theta$ and the domain of $g$ is $\mathbb{R}^D_{\mathrm{reg}} \times PD_m$. The composition
of the two maps satisfies

(8.7) $(g \circ h)(\Lambda,\Omega) = \Omega.$

Partition $J(\phi_G) = (J_\Lambda(\phi_G), J_\Omega(\phi_G))$, where the two parts hold the partial
derivatives with respect to the $|D|$ free entries of $\Lambda$ and the $|B| + m$ free entries of
$\Omega$, respectively. Similarly, partition the Jacobian $J(g) = (J_\Lambda(g), J_\Sigma(g))$. Taking
derivatives in (8.7), we obtain that

(8.8) $J_\Lambda(g)(\Lambda,\phi_G(\Lambda,\Omega)) + J_\Sigma(g)(\Lambda,\phi_G(\Lambda,\Omega))\, J_\Lambda(\phi_G)(\Lambda,\Omega) = 0$

and

(8.9) $J_\Sigma(g)(\Lambda,\phi_G(\Lambda,\Omega))\, J_\Omega(\phi_G)(\Lambda,\Omega) = \begin{pmatrix} 0 \\ I \end{pmatrix},$

where we have ordered rows and columns such that the pairs $(v,w)$ defining elements
in $N$ are listed first. Hence, the identity matrix in the lower block of the right-hand
side of (8.9) is of size $|B| + m$, and indexed by $B \cup V$. Under the same ordering
of rows, observe that using the Jacobian in (8.5) we have

$J_\Lambda(g)(\Lambda,\phi_G(\Lambda,\Omega)) = \begin{pmatrix} J \\ * \end{pmatrix},$

with $J$ evaluated at $\Lambda$ and $\Sigma = \phi_G(\Lambda,\Omega)$.
Combining (8.8) and (8.9), we obtain

(8.10) $J_\Sigma(g)(\Lambda,\phi_G(\Lambda,\Omega)) \cdot J(\phi_G)(\Lambda,\Omega) = \begin{pmatrix} -J & 0 \\ * & I \end{pmatrix},$

where the two blocks of rows are indexed by $N$ and $B \cup V$, and the two blocks of
columns are indexed by $D$ and $B \cup V$. Now note that the restriction of $g$ obtained
by fixing $\Lambda$ is an injective linear map with continuous inverse $\Omega \mapsto (I-\Lambda)^{-T} \Omega (I-\Lambda)^{-1}$.
Therefore, the matrix $J_\Sigma(g)$ is invertible, and we deduce that the rank of $J(\phi_G)$ at
$(\Lambda,\Omega)$ is equal to the sum of $|B| + m$ and the rank of $J$ at $\Lambda$ and $\Sigma = \phi_G(\Lambda,\Omega)$.
This proves our claim relating the rank of $J(\phi_G)$ and that of $J$. □

Lemma 5. Let $v \in V$ be any node. Let $Y \subset V \setminus (\{v\} \cup S(v))$, with $|Y| = |P(v)| = n$.
If the matrix $J = [(I-\Lambda)^T \Sigma]_{Y,P(v)}$ is generically invertible, then $Y$ satisfies the
half-trek criterion with respect to $v$.

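Proof. Abbreviate $P = P(v)$. We have $J = [(I-\Lambda)^T \Sigma]_{Y,P} = [\Omega (I-\Lambda)^{-1}]_{Y,P}$.
Hence, by the Cauchy-Binet formula,

$\det(J) = \sum_{W \subset V,\ |W| = n} \det(\Omega_{Y,W}) \det\left((I-\Lambda)^{-1}_{W,P}\right).$

By assumption, $\det(J)$ is not the zero polynomial/power series. Therefore, for some
$W \subset V$ with $|W| = n$, we have $\det(\Omega_{Y,W}) \ne 0$ and $\det\left((I-\Lambda)^{-1}_{W,P}\right) \ne 0$.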

By Menger's theorem (see, for instance, [Sch04, Theorem 9.1]), the non-vanishing
of $\det\left((I-\Lambda)^{-1}_{W,P}\right)$ implies that there is a system $\Psi$ of pairwise vertex-disjoint
directed paths $\psi_i : w_i \to \cdots \to p_i$, $i \in [n]$, whose sources and targets give $W =
\{w_1, \dots, w_n\}$ and $P = \{p_1, \dots, p_n\}$, respectively. Indeed, if no such system exists,
then by Menger's theorem there is a set $C$ of strictly less than $n$ vertices such
that all directed paths from $W$ to $P$ pass through $C$. But this implies that the
matrix $(I-\Lambda)^{-1}_{W,P}$ factors as the product of $(I-\Lambda)^{-1}_{W,C}$ and a $|C| \times n$ matrix, and $|C| < n$ implies that
$\det\left((I-\Lambda)^{-1}_{W,P}\right) = 0$, a contradiction. Note that by erasing loops, we can further
arrange that the $\psi_i$ do not have self-intersections.

Since $\det(\Omega_{Y,W}) \ne 0$, we can index $Y = \{y_1, \dots, y_n\}$ such that $\Omega_{y_i w_i} \ne 0$ for
all $i$. This implies that either $y_i = w_i$ or $y_i \leftrightarrow w_i \in B$. Now define a system
of half-treks $\Pi : Y \rightrightarrows P$ by setting $\pi_i = \psi_i$ if $w_i = y_i$, and extending $\psi_i$ at the
left-hand side to

$\pi_i = y_i \leftrightarrow w_i \to \cdots \to p_i$

if $y_i \ne w_i$. Since $\Psi$ has no sided intersection, $\Pi$ also has no sided intersection. It
follows that $Y$ satisfies the half-trek criterion with respect to $v$. □

9. Proofs for the weak half-trek criterion 

Lemma 1. Suppose the set $W \subset V$ satisfies the weak half-trek criterion with respect
to some node $v$. Then there exists a set $Y$ satisfying the half-trek criterion with
respect to $v$, such that $Y \cap H(v) = W \cap H(v)$.

Proof. Let $\Pi : W \rightrightarrows P(v)$ be a system of treks satisfying the conditions of the weak
half-trek criterion. Let $r$ be the number of treks in $\Pi$ which are not half-treks, and
suppose $r > 0$. Using induction, it suffices to show that there is a set $W'$ satisfying
the weak half-trek criterion with respect to $v$ via some trek system $\Pi'$ with no more
than $r - 1$ treks that are not half-treks, and for which $W' \cap H(v) = W \cap H(v)$.
Take any $w \in W$ for which the trek $\pi \in \Pi$ with source $w$ is not a half-trek.
By the definition of the weak half-trek criterion, this implies that $w \notin H(v)$. Let
$w' \ne w$ be the (unique) node in the left-hand side of $\pi$ that is closest to the target
of $\pi$, which we denote $t(\pi)$. The trek $\pi$ has the structure

$w \leftarrow \cdots \leftarrow w' \cdots t(\pi).$

Let $\pi'$ be the subtrek from $w'$ to $t(\pi)$. Then $\pi'$ is a half-trek. Since $w$ is a descendant
of $w'$ and $w \notin H(v)$, this implies $w' \notin H(v)$ and $w' \notin S(v)$. Furthermore, $w' \notin
W \setminus \{w\}$ because $\Pi$ has no sided intersection and $w'$ is in the left-hand side of $\pi$.

Define $W' = (W \setminus \{w\}) \cup \{w'\}$ and $\Pi' = (\Pi \setminus \{\pi\}) \cup \{\pi'\}$. Since the original
system of treks $\Pi$ had no sided intersection, the new system of treks $\Pi'$ also has no
sided intersection. Precisely $r - 1$ of the treks in $\Pi'$ are not half-treks. Moreover,
since $w, w' \notin H(v)$, it holds that $W' \cap H(v) = W \cap H(v)$, as needed to be shown. □

Theorem 3 (Weak HTC). Theorems 1 and 2 hold when using the weak half-trek
criterion instead of the half-trek criterion. Moreover, a graph $G$ can be proved
to be rationally identifiable (or generically infinite-to-one) using the weak half-trek
criterion if and only if $G$ is HTC-identifiable (or HTC-infinite-to-one).

Proof. It is sufficient to show the following two facts: 

(a) If G can be proved to be rationally identifiable using the weak half-trek 
criterion, then G is HTC-identifiable. 

(b) If G cannot be proved to be generically infinite-to-one using the weak half- 
trek criterion, then G is not HTC-infinite-to-one. 

Part (a). A graph $G$ can be proved to be rationally identifiable using the weak
half-trek criterion if there is, for each $v$, a set of nodes $W_v$ satisfying the weak
half-trek criterion with respect to $v$, and an ordering $\prec$ such that $w \prec v$ for any
$w \in W_v \cap H(v)$. By Lemma 1, for each $v$, there is then also a set $Y_v$ satisfying
the half-trek criterion with respect to $v$, with $Y_v \cap H(v) = W_v \cap H(v)$. Therefore,
$G$ is seen to be HTC-identifiable using the same ordering $\prec$.

Part (b). If $G$ cannot be proved to be generically infinite-to-one using the weak
half-trek criterion, then there is a family $(W_v : v \in V)$, such that each $W_v$ satisfies
the weak half-trek criterion with respect to $v$, and $v \in W_w$ implies $w \notin W_v$. Using
Lemma 1, we can find, for each $v$, a set $Y_v$ that satisfies the half-trek criterion
with respect to $v$ and for which $Y_v \cap H(v) = W_v \cap H(v)$. Now suppose $v \in Y_w$
for two nodes $v, w \in V$. This means that $v \notin S(w) \cup \{w\}$ and there is a half-trek
$\pi$ with source $v$ and target $w$, which implies that $w \in H(v)$. If also $w \in Y_v$,
then $w \in Y_v \cap H(v) = W_v \cap H(v)$. By symmetry, we also get $v \in W_w$. This
contradicts our assumption, and so $w \notin Y_v$. This proves that $G$ cannot be proved
to be generically infinite-to-one using the half-trek criterion. □

10. Proofs for half-trek versus G-criterion 

In this section, we assume that $G$ is an acyclic mixed graph whose vertex set $V =
[m]$ is enumerated according to some topological ordering under which Theorem 4
applies, making the graph GC-identifiable. Let $A_v$ be the sets from Theorem 4.
Recall Definition 6: for each node $v \in V$, let $Y_v \cup Z_v = A_v$ be the partition that,
together with the systems of treks $\Pi_v : Y_v \rightrightarrows P(v)$ and $\Psi_v : Z_v \rightrightarrows S_<(v)$, witnesses
that $A_v$ satisfies the G-criterion with respect to $v$. For each $v$, for each $\pi \in \Pi_v$ (and
each $\psi \in \Psi_v$), define $\pi'$ (or $\psi'$) by extending $\pi$ (or $\psi$) with the edge $t(\pi) \to v$ (or
$t(\psi) \leftrightarrow v$), as in Definition 6.

Lemma 6. Consider any node $v \in V$. If $w \in \mathrm{Left}(\pi)$ for some trek $\pi \in \Pi_v$, then
$w \ne v$ and $w \notin S(v)$.

Proof. Let y and t(n) be the source and the target of tt, respectively. 

First, suppose that w < v. If w £ S(v), then there is a trek ip £ ty v with source 
z £ Z v and target w that extends to a trek ip' when appending the edge w ■<->• v 
to the right-hand side. Since there is a sided intersection between ip' and tt', we 
cannot have w £ S < (v). 

Next, suppose that $w \geq v$, and condition (C1) of Theorem 4 is satisfied. If $w = v$, then $\mathrm{Depth}(y) \geq \mathrm{Depth}(w) = \mathrm{Depth}(v)$ gives a contradiction to (C1). If instead $w > v$, then $\mathrm{Depth}(w) \leq \mathrm{Depth}(y) < \mathrm{Depth}(v)$, by (C1). Suppose $w \in S(v)$, and consider $A_w$. Since $v \in S_<(w)$, there is a trek $\psi'$ of the form $z \cdots v \leftrightarrow w$ with source $z \in Z_w$. But then $\mathrm{Depth}(z) \geq \mathrm{Depth}(v) > \mathrm{Depth}(w)$, which contradicts (C1). Hence, we cannot have $w = v$ or $w \in S_>(v)$ if condition (C1) is true.

Next, suppose that $w = v$, and condition (C2) of Theorem 4 is satisfied. If $\pi$ is a half-trek, then $v = w = y \in A_v$, a contradiction. If $w = v$ and $\pi$ is not a half-trek, then $y$ is a proper descendant of $w = v$, and so $y \in H(v)$. But then (C2) requires that $\pi$ be a half-trek, a contradiction. Therefore, $w \neq v$ if condition (C2) is true.

Finally, suppose that $w > v$, and condition (C2) of Theorem 4 is satisfied. If $w \in S(v)$, then $y = w$ or $y$ is a proper descendant of $w$. In either case, $y \in H(v) \cup S_>(v)$, and so $\pi$ must be a half-trek. It follows that $\pi$ has source node $w = y$, which implies that $w \prec v$ in the ordering specified by condition (C2). We now consider $A_w$. Since $v \in S_<(w)$, there is a trek $\psi'$ of the form $z \cdots v \leftrightarrow w$ with source $z \in Z_w$. Then $v \in H(w)$ because of the half-trek $\pi'$ from $w$ to $v$. Moreover, either $z = v$ or $z$ is a proper descendant of $v$. Therefore, $z \in H(w)$, and so $\psi'$ must be a half-trek, implying that $z = v$. It follows that $v \in A_w \cap H(w)$, and so $v \prec w$ in the ordering specified by condition (C2). This is a contradiction. Therefore, we cannot have $w \in S_>(v)$ if condition (C2) is true. □

We now prove the theorem. 

Theorem 5. A GC-identifiable acyclic mixed graph is also HTC-identifiable.

Proof. First, consider the case that condition (C1) of Theorem 4 holds. For each $v$, we can uniquely decompose each trek $\pi \in \Pi_v$ as

$y(\pi) \leftarrow \cdots \leftarrow y^*(\pi) \cdots t(\pi),$

where $y(\pi) \in Y_v$ is the source and $t(\pi) \in P(v)$ the target of $\pi$, and the subtrek $\pi^*$ from $y^*(\pi)$ to $t(\pi)$ is a half-trek. By Lemma 6, $y^* \neq v$ and $y^* \notin S(v)$. Furthermore, for two distinct treks $\pi_1, \pi_2 \in \Pi_v$, we must have $y^*(\pi_1) \neq y^*(\pi_2)$, because otherwise there would be a sided intersection between the extensions $\pi_1'$ and $\pi_2'$ of $\pi_1$ and $\pi_2$, respectively. Now define $Y_v^* = \{y^*(\pi) : \pi \in \Pi_v\}$. Using the system of half-treks $\Phi_v = \{\pi^* : \pi \in \Pi_v\}$, we see that $Y_v^*$ satisfies the half-trek criterion with respect to $v$, for each $v$. Finally, define a total ordering $\prec$ on $V$ that agrees with the partial ordering induced by depth. Observe that for all $v$ and all $\pi \in \Pi_v$, it holds that $\mathrm{Depth}(y^*(\pi)) \leq \mathrm{Depth}(y(\pi)) < \mathrm{Depth}(v)$, by condition (C1). Hence, for any $y \in Y_v^* \cap H(v)$, we must have $\mathrm{Depth}(y) < \mathrm{Depth}(v)$, and so $y \prec v$. Consequently, the conditions of Theorem 1 are satisfied, and the graph $G$ is HTC-identifiable.

Next, consider the case that condition (C2) of Theorem 4 holds. For each $v$, by Lemma 6, $Y_v$ is disjoint from $S(v) \cup \{v\}$. By the G-criterion, the system of treks $\Pi_v : Y_v \rightrightarrows P(v)$ has no sided intersection. By condition (C2), a trek $\pi \in \Pi_v$ is a half-trek whenever the source $y(\pi) \in Y_v \cap H(v)$. Therefore, the set $Y_v$ satisfies the weak half-trek criterion with respect to $v$. Finally, take the ordering $\prec$ specified by condition (C2). For each $w \in Y_v \cap H(v)$, we must have $w \prec v$, by condition (C2). Therefore, using the weak half-trek method in Theorem 3, the graph $G$ is seen to be HTC-identifiable. □

Proposition 3. The acyclic mixed graph in Figure [J| is not GC-identifiable.

Proof of Proposition 3. First, note that the sibling sets $S_<(v)$ are unique because this graph $G$ has a unique topological ordering. Next, observe that with both $1 \to 2$ and $1 \leftrightarrow 2$ in the graph, $|P(2)| + |S_<(2)| = 2$. But only node 1 has depth smaller than node 2. Therefore, $G$ cannot be GC-identifiable via condition (C1) of Theorem 4, and it remains to consider condition (C2).

Node 2: We have $S_<(2) = \{1\}$ and must therefore find a set $Z_2 = \{v\}$ such that there exists a trek $\pi$ of the form $v \leftarrow \cdots \leftarrow 1 \leftrightarrow 2$. If $v \in \{3, 4, 5\}$, then $v \in Z_2 \cap H(2)$, and $\pi$ would need to be a half-trek, which is a contradiction. We conclude that $Z_2 = \{1\}$, implying $1 \notin Y_2$. The parent set of node 2 is $P(2) = \{1\}$, and we must find $Y_2 = \{v\}$ for a node $v \in \{3, 4, 5\}$ that is the source of a trek $\pi$ of the form $v \cdots 1 \to 2$. Since $\{3, 4, 5\} \subset H(2)$, the trek $\pi$ must be a half-trek. This restricts the choice to $v \in \{4, 5\}$. Therefore, either $4 \in Y_2 \cap H(2)$ or $5 \in Y_2 \cap H(2)$. Hence, either $4 \prec 2$ or $5 \prec 2$.

Node 4: Starting from $3 \in S_<(4)$ and reasoning as for $Z_2$ before, we must have $3 \in Z_4$ and, consequently, $3 \notin Y_4$. Since $P(4) = \{3\}$, we must have $Y_4 = \{v\}$ with a trek $\pi$ of the form $v \cdots 3 \to 4$. The set $Z_4$ must contain a node $w$ at the source of a trek $w \cdots 1 \leftrightarrow 4$. Hence, $v = 1$ cannot be in $Y_4$ since a sided intersection in the system of treks would be created. Therefore, we must have $v \in \{2, 5\} \subset H(4)$, and so $v \in Y_4 \cap H(4)$. It follows that either $2 \prec 4$ or $5 \prec 4$.

Node 5: We have $4 \in S_<(5)$. By the same reasoning as for $Z_2$ and $Z_4$, it holds that $4 \in Z_5$. Since $4 \in H(5)$, this means that $4 \prec 5$.

We conclude that a total ordering $\prec$ as required for GC-identifiability would have to satisfy either $4 \prec 5 \prec 4$, or $2 \prec 4 \prec 2$, or $2 \prec 4 \prec 5 \prec 2$. Consequently, no such ordering exists. □
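The case analysis above can be double-checked by brute force. The following minimal sketch enumerates all orderings of the nodes 2, 4 and 5 (the only nodes involved in the three constraints derived for Nodes 2, 4 and 5) and collects those that satisfy all constraints simultaneously. The function name `consistent_orderings` is hypothetical and chosen for illustration only.

```python
from itertools import permutations

def consistent_orderings():
    """Return all orderings of {2, 4, 5} satisfying the three constraints."""
    found = []
    for order in permutations([2, 4, 5]):
        pos = {node: i for i, node in enumerate(order)}

        def before(a, b):
            # True if a precedes b in the candidate ordering
            return pos[a] < pos[b]

        node2 = before(4, 2) or before(5, 2)   # constraint from Node 2
        node4 = before(2, 4) or before(5, 4)   # constraint from Node 4
        node5 = before(4, 5)                   # constraint from Node 5
        if node2 and node4 and node5:
            found.append(order)
    return found

print(consistent_orderings())
```

Since any total ordering on $V$ restricts to an ordering of these three nodes, an empty result agrees with the conclusion of the proof.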

11. Proofs for graph decomposition 

Lemma 7. Let $v$ be a node in the mixed component $G'$ of an acyclic mixed graph $G$. Consider the set $H(v)$ in $G$, and let $H'(v)$ be the analogue in $G'$. If there is a set $Y$ that satisfies the half-trek criterion with respect to $v$ in $G$, then there is a set $Y'$ that satisfies the half-trek criterion with respect to $v$ in $G'$, and $Y' \cap H'(v) \subseteq Y \cap H(v)$.

Proof. Let $V'$ be the vertex set of $G'$, and let $C' \subseteq V'$ be the vertex set of the bidirected connected component of $G$ that defined $G'$. We may assume that $v \in C'$, for otherwise $v$ has no parents in $G'$ and the claims concern empty sets. Choose a system of half-treks $\Pi : Y \rightrightarrows P(v)$ with no sided intersection and with $Y \cap (\{v\} \cup S(v)) = \emptyset$. Since $P(v) \subseteq V'$, each half-trek $\pi \in \Pi$ eventually visits only nodes in $V'$. Now take $\Pi'$ to be the set of half-treks obtained by retaining the longest subtrek of each half-trek $\pi \in \Pi$ that remains entirely in $G'$ and contains the target of $\pi$, and let $Y'$ be the set of sources of the half-treks in $\Pi'$. If $\pi' \in \Pi'$ is derived from $\pi \in \Pi$, then either (i) $\pi' = \pi$ or (ii) $\pi'$ is a directed path and its source $y'$ is an element of $V' \setminus C'$.

First, we claim that $\Pi'$ is a system of half-treks. In other words, we claim that the sources $y_1'$ and $y_2'$ of two distinct half-treks $\pi_1', \pi_2' \in \Pi'$ satisfy $y_1' \neq y_2'$. Let $\pi_1$ and $\pi_2$ be the half-treks in $\Pi$ that yielded $\pi_1'$ and $\pi_2'$, respectively. Since $\Pi$ is without sided intersection, $y_1' \neq y_2'$ if both $\pi_1' = \pi_1$ and $\pi_2' = \pi_2$. If, without loss of generality, $\pi_1' \neq \pi_1$, then $y_1' \in \mathrm{Right}(\pi_1)$, and $y_1' \notin C'$. Now suppose $y_2' = y_1'$. Since $\Pi$ has no sided intersection, we must have $y_2' \notin \mathrm{Right}(\pi_2)$. This implies that $\pi_2$ starts with a bidirected edge and, thus, $\pi_2' = \pi_2$, and therefore $y_2' \in C'$, while $y_1' \notin C'$. Consequently, $y_1' \neq y_2'$.

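Second, we claim that $\Pi'$ has no sided intersections. Consider any two distinct $\pi_1', \pi_2' \in \Pi'$. Since $\pi_1'$ and $\pi_2'$ are half-treks, $\mathrm{Left}(\pi_1') = \{y_1'\}$ and $\mathrm{Left}(\pi_2') = \{y_2'\}$. Above, we showed that $y_1' \neq y_2'$, and therefore $\mathrm{Left}(\pi_1') \cap \mathrm{Left}(\pi_2') = \emptyset$. Next we consider the right-hand sides. By definition of $\pi_1'$ and $\pi_2'$, we have $\mathrm{Right}(\pi_1') \subseteq \mathrm{Right}(\pi_1)$ and $\mathrm{Right}(\pi_2') \subseteq \mathrm{Right}(\pi_2)$. Therefore

$\mathrm{Right}(\pi_1') \cap \mathrm{Right}(\pi_2') \subseteq \mathrm{Right}(\pi_1) \cap \mathrm{Right}(\pi_2) = \emptyset.$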

Third, we claim that the set $Y'$ of sources of $\Pi'$ satisfies the half-trek criterion with respect to $v$ in the component $G'$. For this it remains to show that no source in $Y'$ is equal to $v$ or a sibling of $v$. Indeed, if the source $y'$ of a half-trek $\pi' \in \Pi'$ is in $S(v) \cup \{v\} \subseteq C'$, then $\pi' = \pi$ and we have a contradiction to $Y \cap (\{v\} \cup S(v)) = \emptyset$.

Finally, we claim that $Y' \cap H'(v) \subseteq Y \cap H(v)$. Since $G'$ is a subgraph of $G$, we have $H'(v) \subseteq H(v)$. Our claim thus holds because all nodes in $Y' \setminus Y$ are in $V' \setminus C'$ and there are no directed edges pointing to nodes in $V' \setminus C'$ in the graph $G'$. □

Lemma 8. Let $G_1, \dots, G_k$ be the mixed components of an acyclic mixed graph $G$. Suppose that $G_1, \dots, G_k$ are not HTC-infinite-to-one. Then $G$ is not HTC-infinite-to-one.

Proof. For each $j \in [k]$, choose a family $(Y_v^{j} : v \in V_j)$ where each set $Y_v^{j} \subseteq V_j$ satisfies the half-trek criterion with respect to $v$ in the component $G_j$, and for all $v, w \in V_j$, either $v \notin Y_w^{j}$ or $w \notin Y_v^{j}$. Such a family exists by the assumption.

For $v \in V$, let $j(v)$ be the unique index in $[k]$ with $v \in C_{j(v)}$. Define $Y_v = Y_v^{j(v)}$. The original graph $G$ is seen not to be HTC-infinite-to-one if the following two claims are proven:

(a) in $G$, each set $Y_v$ satisfies the half-trek criterion with respect to its indexing node $v$;

(b) for each $v \neq w \in V$, either $v \notin Y_w$ or $w \notin Y_v$.

Proof of claim (a): Fix any $v$ and abbreviate $j = j(v)$. By definition, $Y_v = Y_v^{j}$ satisfies the half-trek criterion with respect to $v$ in $G_j$. This implies that there is a system $\Pi$ of half-treks with no sided intersection from $Y_v$ to $P(v) \cap V_j$, and that $Y_v \subseteq V_j \setminus (\{v\} \cup (S(v) \cap V_j))$. However, by definition of $G_j$, we know $P(v) \subseteq V_j$ and $S(v) \subseteq V_j$. Hence, $Y_v$ satisfies the half-trek criterion with respect to $v$ in $G$.

Proof of claim (b): Fix any two nodes $v \neq w$. If $j(v) = j(w)$, then by assumption, either $v \notin Y_w^{j(w)} = Y_w$ or $w \notin Y_v^{j(v)} = Y_v$. Now suppose $j(v) \neq j(w)$, and $w \in Y_v$. Then $w \in V_{j(v)} \setminus C_{j(v)}$, implying that there exists a directed path from $w$ to $v$ in $G_{j(v)}$. Similarly, if $v \in Y_w$, then there is a directed path from $v$ to $w$ in $G_{j(w)}$. Since the directed part of $G$ is acyclic, the two paths cannot both exist, and so $v \notin Y_w$. □

Lemma 9. Suppose $G$ is not HTC-infinite-to-one. Then $G_j$ is not HTC-infinite-to-one for all $j \in [k]$.

Proof. If $G$ is not HTC-infinite-to-one, then there exists a family $(Y_v : v \in V)$ of subsets of $V$ such that for each $v$, $Y_v$ satisfies the half-trek criterion with respect to $v$, and for all $v \neq w$, either $v \notin Y_w$ or $w \notin Y_v$.

Now fix any $j \in [k]$. For each $v \in V_j$, we adopt the construction from the proof of Lemma 7 to obtain a system of half-treks $\Pi_v' : Y_v' \rightrightarrows P(v)$ in $G_j$ that shows that $Y_v'$ satisfies the half-trek criterion with respect to $v$ in $G_j$. For each $v \in V_j \setminus C_j$, $v$ has no parents in $G_j$, and so we can define $Y_v' = \emptyset$.

Now consider any $v, w \in V_j$. Suppose for a contradiction that $v \in Y_w'$ and $w \in Y_v'$. This implies $Y_v', Y_w' \neq \emptyset$, and so $v, w \in C_j$. In this case, $v$ is the source of some half-trek in $\Pi_w'$. But then $v \in C_j$ implies that this half-trek was unchanged when constructing $\Pi_w'$, and thus $v$ is also in $Y_w$. The same argument shows that $w \in Y_v$, which contradicts the choice of the family $(Y_v : v \in V)$. □
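Both lemmas turn on the same pairwise condition on a family of sets: no two distinct nodes may each belong to the other's set. As a small illustration, the sketch below checks this condition for a family stored as a dictionary from nodes to sets. The function name `has_no_mutual_pair` and the example families are hypothetical, and the check does not verify the half-trek criterion itself.

```python
def has_no_mutual_pair(family):
    """Check that for all v != w, either v not in Y_w or w not in Y_v.

    family: dict mapping each node v to its set Y_v (assumed to be given).
    """
    nodes = list(family)
    return all(
        not (v in family[w] and w in family[v])
        for i, v in enumerate(nodes)
        for w in nodes[i + 1:]
    )

# Hypothetical families, for illustration only.
print(has_no_mutual_pair({1: set(), 2: {1}, 3: {2}}))
print(has_no_mutual_pair({1: {2}, 2: {1}}))
```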



12. Conclusion 

We have proposed graphical criteria for determining identifiability as well as non- 
identifiability of linear structural equation models. To our knowledge, our criteria 
are the best known. They apply to cyclic mixed graphs and, for acyclic graphs, the 
graph decomposition method discussed in Section 7 further extends their scope. It would be interesting to determine whether a similar graph decomposition method can be applied to cyclic graphs as well. To better understand the "gap" between the necessary condition and the sufficient condition for rational identifiability that we have developed, we would also like to find a class of graphs, defined on an arbitrary number of nodes $m$, that is rationally identifiable but not HTC-identifiable.

In models that are not HTC-identifiable, the half-trek method can still prove certain parameters to be rationally identifiable; recall, for instance, the example from Figure 5(a). Referring to Theorem 1, if a set $Y_v$ satisfies the half-trek criterion with respect to the indexing node $v$, and $Y_v \cap H(v) = \emptyset$, then the proof of Theorem 1 shows how to obtain rational expressions in the covariance matrix $\Sigma$ that equal the coefficients $\lambda_{wv}$, where $w \in P(v)$. In the next step of the recursive procedure that proves Theorem 1, we can solve for any node $u$ with $Y_u \cap H(u) \subseteq \{v\}$. Continuing in this way, individual parameters can be identified even though ultimately the procedure will stop before all nodes are visited, as we are discussing an HTC-inconclusive graph. It would be interesting to compare this partial application of the half-trek method to other graphical criteria for identification of individual edge coefficients; see in particular [GPSS10] for a review and examples of such methods.
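The partial recursion described in this paragraph can be sketched as a simple fixed-point iteration. The sketch assumes that, for every node $v$, a candidate set $Y_v$ satisfying the half-trek criterion and the set $H(v)$ have already been computed and are passed in as dictionaries; the function name `solvable_nodes` and the example input are hypothetical.

```python
def solvable_nodes(Y, H):
    """Return the nodes reached by the recursive half-trek procedure.

    Y[v]: a set assumed to satisfy the half-trek criterion with respect to v.
    H[v]: the set of nodes reachable from v by a half-trek.
    A node v is solved once every element of Y[v] & H[v] is already solved.
    """
    solved = set()
    changed = True
    while changed:
        changed = False
        for v in Y:
            if v not in solved and (Y[v] & H[v]) <= solved:
                solved.add(v)
                changed = True
    return solved

# Hypothetical input: nodes 4 and 5 block each other, nodes 1, 2, 3 are solved.
Y = {1: set(), 2: {1}, 3: {2}, 4: {5}, 5: {4}}
H = {1: {2, 3}, 2: {3}, 3: {2}, 4: {5}, 5: {4}}
print(sorted(solvable_nodes(Y, H)))
```

For each solved node $v$, the coefficients $\lambda_{wv}$ with $w \in P(v)$ are the ones the procedure identifies; the search for suitable sets $Y_v$ is not part of this sketch.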

Applying our main results, Theorems 1 and 2, requires one to find sets that satisfy the half-trek criterion with respect to a considered node. In the related context of the G-criterion, Chapter 4 in the Ph.D. thesis [Bri04] formulates this problem as a computation of maximum flow in a network. Revisiting this construction in the context of our half-trek criterion would be useful for the treatment of larger graphs and an efficient computer implementation of the methods from this paper.
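To make the idea concrete, here is a minimal sketch of one possible flow network for the half-trek criterion. The construction is an assumption of this sketch and is not taken from [Bri04]: each node $a$ gets a "left" copy (potential source) and a "right" copy (position on the directed part of a half-trek), each with capacity one, so that a flow of value $|P(v)|$ corresponds to a system of half-treks with distinct sources and disjoint right-hand sides. All function names and node labels are hypothetical.

```python
from collections import defaultdict

def max_flow(cap, s, t):
    """Unit-capacity Ford-Fulkerson using depth-first augmenting paths."""
    flow = 0
    while True:
        parent = {s: None}
        stack = [s]
        while stack and t not in parent:
            u = stack.pop()
            for w, c in cap[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    stack.append(w)
        if t not in parent:
            return flow
        w = t
        while parent[w] is not None:   # push one unit along the path found
            u = parent[w]
            cap[u][w] -= 1
            cap[w][u] += 1
            w = u
        flow += 1

def htc_set_exists(nodes, directed, bidirected, v):
    """Decide (under the assumed construction) whether some Y satisfies the
    half-trek criterion with respect to v."""
    parents = {a for (a, b) in directed if b == v}
    siblings = {b for (a, b) in bidirected if a == v}
    siblings |= {a for (a, b) in bidirected if b == v}
    cap = defaultdict(lambda: defaultdict(int))
    for a in nodes:
        cap[("L_in", a)][("L_out", a)] = 1    # a is used as a source at most once
        cap[("R_in", a)][("R_out", a)] = 1    # a lies on at most one right side
        cap[("L_out", a)][("R_in", a)] = 1    # half-trek without bidirected edge
        if a != v and a not in siblings:
            cap["s"][("L_in", a)] = 1         # allowed sources only
    for a, b in directed:
        cap[("R_out", a)][("R_in", b)] = 1    # directed edge a -> b
    for a, b in bidirected:
        cap[("L_out", a)][("R_in", b)] = 1    # half-trek starting a <-> b
        cap[("L_out", b)][("R_in", a)] = 1
    for p in parents:
        cap[("R_out", p)]["t"] = 1            # half-treks must end at a parent
    return max_flow(cap, "s", "t") == len(parents)

# Instrumental-variable graph 1 -> 2 -> 3 with 2 <-> 3.
print(htc_set_exists([1, 2, 3], [(1, 2), (2, 3)], [(2, 3)], 3))
```

The augmenting-path routine is deliberately simple; for larger graphs a dedicated maximum-flow implementation would be the natural replacement.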
Appendix A. Algebraic techniques for proving and disproving identifiability

In the proofs of our half-trek criteria we have made extensive use of the following equations:

(A.1) $[(I - \Lambda)^T \Sigma (I - \Lambda)]_{vw} = 0 \quad \forall (v,w) \notin B \text{ and } v \neq w.$

In this appendix we discuss the algebro-geometric and computational content of these equations. In what follows, we write $\lambda$ and $\omega$ for tuples of variables representing the non-zero entries of $\Lambda$ and $\Omega$, and $\sigma$ for a tuple of variables representing the entries of $\Sigma$. We will also need the rational function $\delta := \det(I - \Lambda)^{-1}$; in the acyclic case, $\delta$ is just the constant 1. The entries of the inverse $(I - \Lambda)^{-1}$ lie in the ring $\mathbb{R}[\lambda, \delta]$ of polynomials in $\lambda$ and $\delta$. The first observation is the following.



Proposition 5. The left-hand sides of the equations (A.1) generate a radical ideal in the ring $\mathbb{R}[\lambda, \sigma, \delta]$.

We apply the theory of elimination ideals; see [CLO07, Chapter 3].



Proof. Recall the parametrization equation (8.1), namely,

(A.2) $\Sigma - (I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1} = 0,$

and interpret the individual entries of the matrix on the left-hand side as generators of an ideal $J$ in $\mathbb{R}[\lambda, \sigma, \omega, \delta]$. The ideal $J$ is radical because it is the ideal of the graph of the parametrization $\phi_G$. Let $I$ be the ideal generated by the left-hand sides of (A.1). We claim that $I = J \cap \mathbb{R}[\lambda, \sigma, \delta]$; the fact that $J$ is radical then implies that $I$ is radical as well.

The containment $I \subseteq J \cap \mathbb{R}[\lambda, \sigma, \delta]$ is immediate because $J$ contains the entries of the matrix obtained by multiplying (A.2) from the left by $(I - \Lambda)^T$ and from the right by $(I - \Lambda)$, and among these entries are the generators of $I$.

For the converse, let $f \in J \cap \mathbb{R}[\lambda, \sigma, \delta]$. Let $\tilde{\Omega}$ be a symmetric matrix full of auxiliary new variables $\tilde{\omega}_{ij} = \tilde{\omega}_{ji}$, even at positions that do not correspond to bidirected edges, and let $g \in \mathbb{R}[\lambda, \tilde{\omega}, \delta]$ be the polynomial obtained from $f$ by substituting the entries of

$(I - \Lambda)^{-T} \tilde{\Omega} (I - \Lambda)^{-1}$

for $\sigma$. Since $f$ lies in $J$, the polynomial $g$ becomes zero when setting all variables $\tilde{\omega}_{uv}$ with $u \neq v$, $(u,v) \notin B$ equal to zero. This means that $g$ lies in the ideal generated by these variables, i.e.,

$g = \sum_{u \neq v,\ (u,v) \notin B} h_{uv} \cdot \tilde{\omega}_{uv}$

for suitable coefficients $h_{uv} \in \mathbb{R}[\lambda, \tilde{\omega}, \delta]$. Re-substituting $(I - \Lambda)^T \Sigma (I - \Lambda)$ for $\tilde{\Omega}$ on the left-hand side yields $f$, and performing the same substitution on the right yields an $\mathbb{R}[\lambda, \sigma, \delta]$-linear combination of the generators of $I$. It follows that $I = J \cap \mathbb{R}[\lambda, \sigma, \delta]$. □

We continue to write $I$ and $J$ for the two ideals featuring in the proof just given. In more geometric language, the proposition and its proof show that $I$ is the ideal of all polynomials vanishing identically on the projection of the graph of $\phi_G$ into the principal open subset of $(\lambda, \sigma)$-space where $\delta$ is defined.




Lemma 10. The parameter $\lambda_{uv}$ is rationally identifiable if and only if $I$ contains an element of the form $a(\sigma)\lambda_{uv} - b(\sigma)$ with $a, b \in \mathbb{R}[\sigma]$ and $a$ not identically zero on the linear structural equation model given by the graph $G$.

In fact, in this case $b$ will not be identically zero on the model, either.

Proof. By definition, if $\lambda_{uv}$ is rationally identifiable, then there is a rational function $b(\sigma)/a(\sigma) \in \mathbb{R}(\sigma)$ which upon substituting for $\sigma$ the entries of $(I - \Lambda)^{-T} \Omega (I - \Lambda)^{-1}$ becomes equal to $\lambda_{uv}$; in particular, this substitution must be well-defined, so that $a(\sigma)$ does not vanish identically on the model. This means that the polynomial $a(\sigma)\lambda_{uv} - b(\sigma)$ lies in the ideal $J$ of the graph of the parametrization. Since it only depends on $\lambda$ and $\sigma$, it lies in $I$ (see the proof of Proposition 5). Conversely, if $a, b$ are as in the lemma, then $b/a$ is a rational function identifying $\lambda_{uv}$ from $\sigma$. □

Lemma 10 yields an algorithm for checking rational identifiability of a graph $G$ that is very close to that of [GPSS10], the main difference being that we use the equations in (A.1) rather than those in (A.2).



Algorithm 1 Check rational identifiability

(1) Make a list $S$ containing all matrix entries of the left-hand side of (A.1), together with the additional polynomial $\delta \cdot \det(I - \Lambda) - 1$, in which $\delta$ is treated as a variable.

(2) Choose a block monomial order $>$ on the monomials in the variables $\lambda, \sigma, \delta$ with $\delta > \lambda > \sigma$; that is, when comparing two monomials, first compare the exponents of $\delta$, and in case of a tie compare the $\lambda$-parts of the monomials, and in case of a tie compare the $\sigma$-parts.

(3) Compute a reduced Gröbner basis $T$ with respect to $>$ of the ideal $I$ generated by $S$.

(4) Then $G$ is rationally identifiable if and only if for each $(u, v) \in D$ the basis $T$ contains an element whose leading monomial equals a monomial in $\sigma$ times $\lambda_{uv}$.
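The block monomial order in Step (2) and the test in Step (4) can be illustrated with a small data-structure sketch. Here a monomial is stored as a dictionary holding the exponent of $\delta$ and exponent tuples for the $\lambda$- and $\sigma$-variables; the use of a lexicographic order inside each block is an assumption of this sketch (the text only fixes the order of the blocks), and all names are hypothetical.

```python
def block_key(monomial):
    """Sort key realizing the block order delta > lambda > sigma.

    Tuples compare entry by entry, so the delta exponent is compared first,
    then the lambda exponents, then the sigma exponents.
    """
    return (monomial["delta"], monomial["lam"], monomial["sigma"])

def leading_monomial(monomials):
    """Largest monomial of a polynomial under the block order."""
    return max(monomials, key=block_key)

def identifies(monomial, index):
    """True if the monomial is lambda_index times a monomial in sigma only."""
    lam = monomial["lam"]
    return monomial["delta"] == 0 and lam[index] == 1 and sum(lam) == 1

# Monomials of sigma_11 * lambda_12 - sigma_12, with variables ordered as
# lam = (lambda_12, lambda_23) and sigma = (sigma_11, sigma_12, sigma_13).
f = [
    {"delta": 0, "lam": (1, 0), "sigma": (1, 0, 0)},
    {"delta": 0, "lam": (0, 0), "sigma": (0, 1, 0)},
]
lm = leading_monomial(f)
print(lm, identifies(lm, 0))
```

In this toy polynomial the leading monomial is $\sigma_{11}\lambda_{12}$, which has the shape required in Step (4) for the parameter $\lambda_{12}$. Computing the Gröbner basis itself is left to a computer algebra system.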



Correctness of the algorithm. If $T$ contains a polynomial $f_{uv}$ whose leading monomial equals $\lambda_{uv}$ times a monomial in $\sigma$, then $f_{uv}$ is of the form $a(\sigma)\lambda_{uv} - b(\sigma, \lambda)$, where $b$ only contains $\lambda$-variables smaller than $\lambda_{uv}$. Moreover, $a$ does not vanish identically on the model (or else $a$ would be in $I$ and hence $f_{uv}$ would not be reduced). Therefore, $\lambda_{uv}$ can be rationally identified if all smaller $\lambda$-variables can. Hence, if we assume that $T$ contains such a polynomial for all $(u, v) \in D$, then $G$ is rationally identifiable.

Conversely, if $\lambda_{uv}$ is rationally identified by $b(\sigma)/a(\sigma)$, then $a(\sigma)\lambda_{uv} - b(\sigma) \in I$ by Lemma 10. Replace $a$ by its reduction modulo $T$; this reduction is non-zero since $a$ does not vanish identically on the model, and it contains only the variables $\sigma$ because of the choice of monomial order. Now the leading monomial of $a(\sigma)\lambda_{uv} - b(\sigma)$ equals $\lambda_{uv}$ times the leading monomial of $a$, and it is divisible by the leading monomial of some element $f$ of $T$. Then $f$ has leading monomial $\lambda_{uv}$ times some monomial in $\sigma$, as required. □

The reduced Gröbner basis $T$ contains more information than is used in Step (4) of the algorithm just described. Indeed, straightforward modifications of Step (4) can be used to test whether the parametrization is generically finite-to-one, and to find the degree of identifiability ID(G).

Figure 8. Cyclic graph exemplifying the need to remove points $(\Lambda, \Omega)$ with $\det(I - \Lambda) = 0$ in algebraic computations.

For large-scale computations such as those in Section 6, the presented algorithm is too involved. Instead, we used a randomized version in which the variables $\sigma$ are replaced by the numerical values of the entries of randomly chosen matrices in the model. In other words, for random choices of $\Lambda_0 \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega_0 \in PD(B)$, we compute a reduced Gröbner basis for the equation system

(A.3) $[(I - \Lambda)^T \phi_G(\Lambda_0, \Omega_0) (I - \Lambda)]_{vw} = 0 \quad \forall (v,w) \notin B \text{ and } v \neq w,$

(A.4) $\delta \cdot \det(I - \Lambda) - 1 = 0,$

under a block monomial order with $\delta > \lambda$. The reduced Gröbner basis informs us about the dimension and cardinality of the solution set of (A.3) and (A.4) over the complex numbers, and readily yields the degree of identifiability ID(G). In particular, the basis corresponds to a linear equation system with unique solution if and only if the graph is rationally identifiable. Formally, the claims in the last sentences hold with probability one if $(\Lambda_0, \Omega_0)$ is drawn from a continuous probability distribution. In practice, we generate random integer-valued matrices that are then processed in a computer algebra system such as Singular [DGPS11]. To guard against occasional false conclusions from random draws that yield matrices in special position, we repeat the randomized calculation several times for each graph.
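The randomized strategy can be illustrated on a toy case that needs no Gröbner basis. The sketch below uses the acyclic instrumental-variable graph $1 \to 2 \to 3$ with $2 \leftrightarrow 3$ (an example chosen for this illustration, not one of the graphs from the computations above), for which the fiber equations (A.3) read $\sigma_{12} - \lambda_{12}\sigma_{11} = 0$ and $\sigma_{13} - \lambda_{23}\sigma_{12} = 0$ and are linear in $\lambda$. Random integer matrices are drawn, the covariance matrix is computed exactly, and draws in special position (here $\sigma_{12} = 0$) are reported as inconclusive. All function names are hypothetical.

```python
import random
from fractions import Fraction

def matmul(A, B):
    """Exact product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def covariance(lam12, lam23, omega):
    """Sigma = (I - Lambda)^{-T} Omega (I - Lambda)^{-1} for the toy graph."""
    # Lambda is nilpotent here, so (I - Lambda)^{-1} = I + Lambda + Lambda^2.
    inv = [[1, lam12, lam12 * lam23],
           [0, 1, lam23],
           [0, 0, 1]]
    inv_t = [list(col) for col in zip(*inv)]
    return matmul(matmul(inv_t, omega), inv)

def solve_fiber(sigma):
    """Solve the two linear fiber equations; None signals special position."""
    if sigma[0][1] == 0:
        return None
    return (Fraction(sigma[0][1], sigma[0][0]),   # lambda_12 = sigma_12 / sigma_11
            Fraction(sigma[0][2], sigma[0][1]))   # lambda_23 = sigma_13 / sigma_12

def random_draw(rng):
    """Random integer Lambda_0 and a positive definite integer Omega_0."""
    lam12, lam23 = rng.randint(-5, 5), rng.randint(-5, 5)
    w23 = rng.randint(-3, 3)
    omega = [[rng.randint(1, 5), 0, 0],
             [0, abs(w23) + rng.randint(1, 5), w23],
             [0, w23, abs(w23) + rng.randint(1, 5)]]
    return lam12, lam23, omega

def run(trials=5, seed=0):
    """Repeat the randomized calculation; return (solution, truth) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        lam12, lam23, omega = random_draw(rng)
        results.append((solve_fiber(covariance(lam12, lam23, omega)), (lam12, lam23)))
    return results

for solution, truth in run():
    print(solution, truth)
```

Repeating the draw mirrors the safeguard described above: a single draw with $\lambda_{12} = 0$ would be inconclusive for $\lambda_{23}$, whereas a draw in general position recovers both coefficients exactly.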



Finally, we stress with our last example that the equation (A.4) cannot be omitted when studying cyclic graphs, even when $\Lambda_0$ is chosen to be in $\mathbb{R}^D_{\mathrm{reg}}$.

Example 9. For the graph $G$ in Figure 8, a run of Algorithm 1 without specializing values shows that the ideal in $\mathbb{R}[\lambda, \sigma]$ generated by the equations (A.1) contains elements $a_{12}\lambda_{12} + b_{12}$, $a_{14}\lambda_{14} + b_{14}$, $a_{23}\lambda_{23} + b_{23}$ with the $a_{ij}, b_{ij}$ polynomials in $\mathbb{R}[\sigma]$ not vanishing identically on the model, but it does not contain similar elements $a_{41}\lambda_{41} + b_{41}$ or $a_{32}\lambda_{32} + b_{32}$. Furthermore, running a Gröbner basis computation on the fiber equations (A.3) with randomly specialized values yields that the fiber has multiplicity 3, and hence the ideal generated by these equations is not radical. Both of these issues disappear when introducing the auxiliary variable $\delta$ and imposing $\delta \cdot \det(I - \Lambda) - 1 = 0$, with $\Sigma$ specialized in the second case. Then the algorithm proves that $G$ is rationally identifiable. In fact, $G$ is HTC-identifiable, because
Theorem [7] applies with the ordering 1 <"& <\ < \ <§ and the sets 

$Y_2 = \{1, 4\}, \quad Y_3 = \{2\}, \quad Y_1 = \{3\}, \quad Y_4 = \{1\}, \quad Y_5 = \emptyset.$